Netdev List
* Re: [PATCH v5] Gemini: Gigabit ethernet driver
From: Michał Mirosław @ 2011-02-05 13:19 UTC
  To: netdev, linux-kernel; +Cc: David Miller
In-Reply-To: <20110202171814.GA10458@rere.qmqm.pl>

On Wed, Feb 02, 2011 at 06:18:14PM +0100, Michał Mirosław wrote:
> If I make it buildable (NOT working) on other archs is it enough I add
> Kconfig dependency on (ARCH_GEMINI || BROKEN) to allow it to be
> build-tested there?

Hmm. CONFIG_BROKEN is not easily selectable anymore, so it's unlikely
that many people will even know about it. What's the correct way to
make a driver be built but not installed by default? I can use:

default m if ARCH_GEMINI
default n

But that will make it get installed on x86 allmodconfig even though it
won't ever work there, so it will pollute the module set for other
arches unless distribution managers remember that not every driver
built needs to be included.

Best Regards,
Michał Mirosław


* Re: [PATCH] tcp: Increase the initial congestion window to 10.
From: Jerry Chu @ 2011-02-05  3:39 UTC
  To: Ilpo Järvinen; +Cc: David Miller, Netdev, therbert
In-Reply-To: <alpine.DEB.2.00.1102042126440.28937@melkinpaasi.cs.helsinki.fi>

On Fri, Feb 4, 2011 at 11:43 AM, Ilpo Järvinen
<ilpo.jarvinen@helsinki.fi> wrote:
>
> On Thu, 3 Feb 2011, H.K. Jerry Chu wrote:
>
> > On Thu, Feb 3, 2011 at 2:43 PM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
> > > It would perhaps be useful to change receiver advertized window to include
> > > some extra segs initially. It should be >= IW + peer's dupThresh-1 as
> > > otherwise limited transmit won't work for the initial window because we
> > > won't open more receiver window with dupacks (IIRC, I suppose Jerry might
> > > be able to correct me right away if I'm wrong and we open window with
> > > dupacks too?).
> >
> > Sorry I don't know how the receive window is updated in Linux,
> > autotuning or not.
> > But I just wonder why would it have to do with dupacks, i.e., why would
> > it not slide forward as long as the left edge of the window slides
> > forward, regardless of OOO pkt arrival?
>
> ?? DupACK by definition does not slide the left edge?!? :-) ...It
> certainly makes a difference whether the ACK is cumulative or not.
> Anyway, I tcpdumped it now and confirmed that advertized window is not
> advanced if OOO packet arrives.

Cwnd discounts packets that have left the network but rwnd won't discharge
packets until they are consumed by ULP, so you are right that in case
the packets from or near the head of the retransmission queue get
dropped, rwnd won't open up room for more packets even though cwnd will.
To cover that case, initrwnd needs to be larger than initcwnd.

Jerry

>
> > I am of the opinion that rwnd is for flow control purposes only and thus
> > should be fully decoupled from the cwnd of the other (sender) side.
> > Therefore initrwnd should normally be based on projected BDP and local
> > memory pressure, e.g., 64KB, not bearing any relation to the IW of the
> > other side. Only under special circumstances should it be used to
> > constrain the sender, e.g., for devices behind slow links with very
> > small buffers.
>
> I also think along the lines that the advertized window autotuning code
> is just unnecessarily preventive (besides the IW change, also Quickstart
> couldn't be used that efficiently because of it).
>
> --
>  i.
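
As a rough illustration of the bound Ilpo suggests above (receiver window
>= IW + peer's dupThresh - 1), here is a minimal C sketch. The function
name min_init_rwnd_segs and its parameters are hypothetical and are not
kernel symbols; it only restates the arithmetic from the thread.

```c
/*
 * Smallest initial receive window, in segments, that still leaves room
 * for limited transmit on the sender's initial window: the IW itself
 * plus one extra segment for each duplicate ACK short of the threshold.
 * Illustrative only; not taken from the Linux TCP code.
 */
unsigned int min_init_rwnd_segs(unsigned int init_cwnd,
				unsigned int dup_thresh)
{
	/* With no dupack threshold there is nothing extra to reserve. */
	if (dup_thresh == 0)
		return init_cwnd;

	return init_cwnd + dup_thresh - 1;
}
```

With an initial congestion window of 10 segments and a dupThresh of 3,
min_init_rwnd_segs(10, 3) evaluates to 12 segments.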


* Re: [PATCH] ServerEngines, benet: Avoid potential null deref in be_cmd_get_seeprom_data()
From: Ajit Khaparde @ 2011-02-05  3:18 UTC
  To: netdev, jj; +Cc: linux-kernel

> From: Jesper Juhl [jj@chaosbits.net]
> Sent: Thursday, February 03, 2011 3:27 PM
> To: netdev@vger.kernel.org
> Cc: linux-drivers; linux-kernel@vger.kernel.org; Khaparde, Ajit; Bandi, Sarveshwar; Seetharaman, Subramanian; Perla, Sathya
> Subject: [PATCH] ServerEngines, benet: Avoid potential null deref in be_cmd_get_seeprom_data()

> wrb_from_mccq() may return null, so we may crash on a null deref in
> be_cmd_get_seeprom_data().
> This avoids that potential crash.

> Signed-off-by: Jesper Juhl <jj@chaosbits.net>

Thanks Jesper.
But because we have acquired a lock, we need to release it.
I would suggest considering the following patch.
 
---

[PATCH] ServerEngines, benet: Avoid potential null deref in be_cmd_get_seeprom_data()

Found by: Jesper Juhl <jj@chaosbits.net>

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
---
 drivers/net/benet/be_cmds.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/benet/be_cmds.c b/drivers/net/benet/be_cmds.c
index 0c7811f..a179cc6 100644
--- a/drivers/net/benet/be_cmds.c
+++ b/drivers/net/benet/be_cmds.c
@@ -1786,6 +1786,10 @@ int be_cmd_get_seeprom_data(struct be_adapter *adapter,
 	spin_lock_bh(&adapter->mcc_lock);
 
 	wrb = wrb_from_mccq(adapter);
+	if (!wrb) {
+		status = -EBUSY;
+		goto err;
+	}
 	req = nonemb_cmd->va;
 	sge = nonembedded_sgl(wrb);
 
@@ -1801,6 +1805,7 @@ int be_cmd_get_seeprom_data(struct be_adapter *adapter,
 
 	status = be_mcc_notify_wait(adapter);
 
+err:
 	spin_unlock_bh(&adapter->mcc_lock);
 	return status;
 }
-- 
1.7.1
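
The point about releasing the lock can also be sketched outside the
kernel. What follows is a minimal userspace analogue that uses a pthread
mutex in place of spin_lock_bh(); demo_adapter, demo_get_slot and
demo_cmd are hypothetical names and are not benet driver code.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the adapter and its request queue. */
struct demo_adapter {
	pthread_mutex_t lock;
	int free_slots;		/* 0 means the queue is full */
	int slot;
};

/* Returns NULL when no slot is free, much as wrb_from_mccq() may. */
int *demo_get_slot(struct demo_adapter *a)
{
	if (a->free_slots == 0)
		return NULL;
	a->free_slots--;
	return &a->slot;
}

int demo_cmd(struct demo_adapter *a)
{
	int status = 0;
	int *slot;

	pthread_mutex_lock(&a->lock);

	slot = demo_get_slot(a);
	if (!slot) {
		status = -EBUSY;
		goto err;	/* lock is held: leave through the unlock */
	}
	*slot = 1;		/* stands in for building and posting the command */

err:
	pthread_mutex_unlock(&a->lock);
	return status;
}
```

Returning directly from the NULL check would leave the mutex held, and
the next caller would block forever; routing the failure through the
label keeps a single unlock on every path.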


* [net-next-2.6 PATCH 5/5] enic: Update MAINTAINERS
From: Vasanthy Kolluri @ 2011-02-05  2:17 UTC
  To: davem; +Cc: netdev
In-Reply-To: <20110205021618.8510.34816.stgit@savbu-pc100.cisco.com>

From: Vasanthy Kolluri <vkolluri@cisco.com>



Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
---
 MAINTAINERS |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)


diff --git a/MAINTAINERS b/MAINTAINERS
index 424887b..1d2abfa 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1691,6 +1691,7 @@ S:	Supported
 F:	scripts/checkpatch.pl
 
 CISCO VIC ETHERNET NIC DRIVER
+M:	Christian Benvenuti <benve@cisco.com>
 M:	Vasanthy Kolluri <vkolluri@cisco.com>
 M:	Roopa Prabhu <roprabhu@cisco.com>
 M:	David Wang <dwang2@cisco.com>



* [net-next-2.6 PATCH 4/5] enic: Clean up: Remove support for an older version of hardware
From: Vasanthy Kolluri @ 2011-02-05  2:17 UTC
  To: davem; +Cc: netdev
In-Reply-To: <20110205021618.8510.34816.stgit@savbu-pc100.cisco.com>

From: Vasanthy Kolluri <vkolluri@cisco.com>

Remove support for an older version (A1) of hardware

Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
---
 drivers/net/enic/enic.h      |    3 +-
 drivers/net/enic/enic_dev.c  |   11 --------
 drivers/net/enic/enic_dev.h  |    1 -
 drivers/net/enic/enic_main.c |   56 ++----------------------------------------
 drivers/net/enic/vnic_dev.c  |   19 --------------
 drivers/net/enic/vnic_dev.h  |    8 ------
 drivers/net/enic/vnic_rq.h   |    5 ----
 7 files changed, 4 insertions(+), 99 deletions(-)


diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 7316267..57fcaee 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -32,7 +32,7 @@
 
 #define DRV_NAME		"enic"
 #define DRV_DESCRIPTION		"Cisco VIC Ethernet NIC Driver"
-#define DRV_VERSION		"2.1.1.5"
+#define DRV_VERSION		"2.1.1.6"
 #define DRV_COPYRIGHT		"Copyright 2008-2011 Cisco Systems, Inc"
 
 #define ENIC_BARS_MAX		6
@@ -101,7 +101,6 @@ struct enic {
 	/* receive queue cache line section */
 	____cacheline_aligned struct vnic_rq rq[ENIC_RQ_MAX];
 	unsigned int rq_count;
-	int (*rq_alloc_buf)(struct vnic_rq *rq);
 	u64 rq_truncated_pkts;
 	u64 rq_bad_fcs;
 	struct napi_struct napi[ENIC_RQ_MAX];
diff --git a/drivers/net/enic/enic_dev.c b/drivers/net/enic/enic_dev.c
index 3826266..37ad3a1 100644
--- a/drivers/net/enic/enic_dev.c
+++ b/drivers/net/enic/enic_dev.c
@@ -110,17 +110,6 @@ int enic_dev_del_addr(struct enic *enic, u8 *addr)
 	return err;
 }
 
-int enic_dev_hw_version(struct enic *enic, enum vnic_dev_hw_version *hw_ver)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_hw_version(enic->vdev, hw_ver);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
 int enic_dev_notify_unset(struct enic *enic)
 {
 	int err;
diff --git a/drivers/net/enic/enic_dev.h b/drivers/net/enic/enic_dev.h
index 3ac6ba1..495f57f 100644
--- a/drivers/net/enic/enic_dev.h
+++ b/drivers/net/enic/enic_dev.h
@@ -29,7 +29,6 @@ int enic_dev_add_addr(struct enic *enic, u8 *addr);
 int enic_dev_del_addr(struct enic *enic, u8 *addr);
 void enic_vlan_rx_add_vid(struct net_device *netdev, u16 vid);
 void enic_vlan_rx_kill_vid(struct net_device *netdev, u16 vid);
-int enic_dev_hw_version(struct enic *enic, enum vnic_dev_hw_version *hw_ver);
 int enic_dev_notify_unset(struct enic *enic);
 int enic_dev_hang_notify(struct enic *enic);
 int enic_dev_set_ig_vlan_rewrite_mode(struct enic *enic);
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index d6cdecc..0c24370 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -1348,50 +1348,6 @@ static int enic_rq_alloc_buf(struct vnic_rq *rq)
 	return 0;
 }
 
-static int enic_rq_alloc_buf_a1(struct vnic_rq *rq)
-{
-	struct rq_enet_desc *desc = vnic_rq_next_desc(rq);
-
-	if (vnic_rq_posting_soon(rq)) {
-
-		/* SW workaround for A0 HW erratum: if we're just about
-		 * to write posted_index, insert a dummy desc
-		 * of type resvd
-		 */
-
-		rq_enet_desc_enc(desc, 0, RQ_ENET_TYPE_RESV2, 0);
-		vnic_rq_post(rq, 0, 0, 0, 0);
-	} else {
-		return enic_rq_alloc_buf(rq);
-	}
-
-	return 0;
-}
-
-static int enic_set_rq_alloc_buf(struct enic *enic)
-{
-	enum vnic_dev_hw_version hw_ver;
-	int err;
-
-	err = enic_dev_hw_version(enic, &hw_ver);
-	if (err)
-		return err;
-
-	switch (hw_ver) {
-	case VNIC_DEV_HW_VER_A1:
-		enic->rq_alloc_buf = enic_rq_alloc_buf_a1;
-		break;
-	case VNIC_DEV_HW_VER_A2:
-	case VNIC_DEV_HW_VER_UNKNOWN:
-		enic->rq_alloc_buf = enic_rq_alloc_buf;
-		break;
-	default:
-		return -ENODEV;
-	}
-
-	return 0;
-}
-
 static void enic_rq_indicate_buf(struct vnic_rq *rq,
 	struct cq_desc *cq_desc, struct vnic_rq_buf *buf,
 	int skipped, void *opaque)
@@ -1528,7 +1484,7 @@ static int enic_poll(struct napi_struct *napi, int budget)
 			0 /* don't unmask intr */,
 			0 /* don't reset intr timer */);
 
-	err = vnic_rq_fill(&enic->rq[0], enic->rq_alloc_buf);
+	err = vnic_rq_fill(&enic->rq[0], enic_rq_alloc_buf);
 
 	/* Buffer allocation failed. Stay in polling
 	 * mode so we can try to fill the ring again.
@@ -1578,7 +1534,7 @@ static int enic_poll_msix(struct napi_struct *napi, int budget)
 			0 /* don't unmask intr */,
 			0 /* don't reset intr timer */);
 
-	err = vnic_rq_fill(&enic->rq[rq], enic->rq_alloc_buf);
+	err = vnic_rq_fill(&enic->rq[rq], enic_rq_alloc_buf);
 
 	/* Buffer allocation failed. Stay in polling mode
 	 * so we can try to fill the ring again.
@@ -1781,7 +1737,7 @@ static int enic_open(struct net_device *netdev)
 	}
 
 	for (i = 0; i < enic->rq_count; i++) {
-		vnic_rq_fill(&enic->rq[i], enic->rq_alloc_buf);
+		vnic_rq_fill(&enic->rq[i], enic_rq_alloc_buf);
 		/* Need at least one buffer on ring to get going */
 		if (vnic_rq_desc_used(&enic->rq[i]) == 0) {
 			netdev_err(netdev, "Unable to alloc receive buffers\n");
@@ -2347,12 +2303,6 @@ static int enic_dev_init(struct enic *enic)
 
 	enic_init_vnic_resources(enic);
 
-	err = enic_set_rq_alloc_buf(enic);
-	if (err) {
-		dev_err(dev, "Failed to set RQ buffer allocator, aborting\n");
-		goto err_out_free_vnic_resources;
-	}
-
 	err = enic_set_rss_nic_cfg(enic);
 	if (err) {
 		dev_err(dev, "Failed to config nic, aborting\n");
diff --git a/drivers/net/enic/vnic_dev.c b/drivers/net/enic/vnic_dev.c
index fb35d8b..c489e72 100644
--- a/drivers/net/enic/vnic_dev.c
+++ b/drivers/net/enic/vnic_dev.c
@@ -419,25 +419,6 @@ int vnic_dev_fw_info(struct vnic_dev *vdev,
 	return err;
 }
 
-int vnic_dev_hw_version(struct vnic_dev *vdev, enum vnic_dev_hw_version *hw_ver)
-{
-	struct vnic_devcmd_fw_info *fw_info;
-	int err;
-
-	err = vnic_dev_fw_info(vdev, &fw_info);
-	if (err)
-		return err;
-
-	if (strncmp(fw_info->hw_version, "A1", sizeof("A1")) == 0)
-		*hw_ver = VNIC_DEV_HW_VER_A1;
-	else if (strncmp(fw_info->hw_version, "A2", sizeof("A2")) == 0)
-		*hw_ver = VNIC_DEV_HW_VER_A2;
-	else
-		*hw_ver = VNIC_DEV_HW_VER_UNKNOWN;
-
-	return 0;
-}
-
 int vnic_dev_spec(struct vnic_dev *vdev, unsigned int offset, unsigned int size,
 	void *value)
 {
diff --git a/drivers/net/enic/vnic_dev.h b/drivers/net/enic/vnic_dev.h
index 05f9a24..e837546 100644
--- a/drivers/net/enic/vnic_dev.h
+++ b/drivers/net/enic/vnic_dev.h
@@ -44,12 +44,6 @@ static inline void writeq(u64 val, void __iomem *reg)
 #undef pr_fmt
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
-enum vnic_dev_hw_version {
-	VNIC_DEV_HW_VER_UNKNOWN,
-	VNIC_DEV_HW_VER_A1,
-	VNIC_DEV_HW_VER_A2,
-};
-
 enum vnic_dev_intr_mode {
 	VNIC_DEV_INTR_MODE_UNKNOWN,
 	VNIC_DEV_INTR_MODE_INTX,
@@ -93,8 +87,6 @@ int vnic_dev_cmd(struct vnic_dev *vdev, enum vnic_devcmd_cmd cmd,
 	u64 *a0, u64 *a1, int wait);
 int vnic_dev_fw_info(struct vnic_dev *vdev,
 	struct vnic_devcmd_fw_info **fw_info);
-int vnic_dev_hw_version(struct vnic_dev *vdev,
-	enum vnic_dev_hw_version *hw_ver);
 int vnic_dev_spec(struct vnic_dev *vdev, unsigned int offset, unsigned int size,
 	void *value);
 int vnic_dev_stats_dump(struct vnic_dev *vdev, struct vnic_stats **stats);
diff --git a/drivers/net/enic/vnic_rq.h b/drivers/net/enic/vnic_rq.h
index 37f08de..2056586 100644
--- a/drivers/net/enic/vnic_rq.h
+++ b/drivers/net/enic/vnic_rq.h
@@ -141,11 +141,6 @@ static inline void vnic_rq_post(struct vnic_rq *rq,
 	}
 }
 
-static inline int vnic_rq_posting_soon(struct vnic_rq *rq)
-{
-	return (rq->to_use->index & VNIC_RQ_RETURN_RATE) == 0;
-}
-
 static inline void vnic_rq_return_descs(struct vnic_rq *rq, unsigned int count)
 {
 	rq->ring.desc_avail += count;



* [net-next-2.6 PATCH 3/5] enic: Bug Fix: Reorder firmware devcmds - CMD_INIT and CMD_IG_VLAN_REWRITE_MODE
From: Vasanthy Kolluri @ 2011-02-05  2:17 UTC
  To: davem; +Cc: netdev
In-Reply-To: <20110205021618.8510.34816.stgit@savbu-pc100.cisco.com>

From: Vasanthy Kolluri <vkolluri@cisco.com>

Firmware requires CMD_IG_VLAN_REWRITE_MODE be issued before a CMD_INIT.

Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
---
 drivers/net/enic/enic.h      |    2 +-
 drivers/net/enic/enic_main.c |   28 ++++++++++++++++------------
 2 files changed, 17 insertions(+), 13 deletions(-)


diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index f38ad63..7316267 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -32,7 +32,7 @@
 
 #define DRV_NAME		"enic"
 #define DRV_DESCRIPTION		"Cisco VIC Ethernet NIC Driver"
-#define DRV_VERSION		"2.1.1.4"
+#define DRV_VERSION		"2.1.1.5"
 #define DRV_COPYRIGHT		"Copyright 2008-2011 Cisco Systems, Inc"
 
 #define ENIC_BARS_MAX		6
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 3893370..d6cdecc 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -2359,13 +2359,6 @@ static int enic_dev_init(struct enic *enic)
 		goto err_out_free_vnic_resources;
 	}
 
-	err = enic_dev_set_ig_vlan_rewrite_mode(enic);
-	if (err) {
-		dev_err(dev,
-			"Failed to set ingress vlan rewrite mode, aborting.\n");
-		goto err_out_free_vnic_resources;
-	}
-
 	switch (vnic_dev_get_intr_mode(enic->vdev)) {
 	default:
 		netif_napi_add(netdev, &enic->napi[0], enic_poll, 64);
@@ -2504,6 +2497,22 @@ static int __devinit enic_probe(struct pci_dev *pdev,
 		goto err_out_vnic_unregister;
 	}
 
+	/* Setup devcmd lock
+	 */
+
+	spin_lock_init(&enic->devcmd_lock);
+
+	/*
+	 * Set ingress vlan rewrite mode before vnic initialization
+	 */
+
+	err = enic_dev_set_ig_vlan_rewrite_mode(enic);
+	if (err) {
+		dev_err(dev,
+			"Failed to set ingress vlan rewrite mode, aborting.\n");
+		goto err_out_dev_close;
+	}
+
 	/* Issue device init to initialize the vnic-to-switch link.
 	 * We'll start with carrier off and wait for link UP
 	 * notification later to turn on carrier.  We don't need
@@ -2527,11 +2536,6 @@ static int __devinit enic_probe(struct pci_dev *pdev,
 		}
 	}
 
-	/* Setup devcmd lock
-	 */
-
-	spin_lock_init(&enic->devcmd_lock);
-
 	err = enic_dev_init(enic);
 	if (err) {
 		dev_err(dev, "Device initialization failed, aborting\n");



* [net-next-2.6 PATCH 2/5] enic: Bug Fix: Fix return values of enic_add/del_station_addr routines
From: Vasanthy Kolluri @ 2011-02-05  2:17 UTC
  To: davem; +Cc: netdev
In-Reply-To: <20110205021618.8510.34816.stgit@savbu-pc100.cisco.com>

From: Vasanthy Kolluri <vkolluri@cisco.com>

Fix enic_add/del_station_addr routines to return appropriate error code
when an invalid address is added or deleted.

Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
---
 drivers/net/enic/enic.h     |    2 +-
 drivers/net/enic/enic_dev.c |   26 ++++++++++++++------------
 2 files changed, 15 insertions(+), 13 deletions(-)


diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 1385a60..f38ad63 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -32,7 +32,7 @@
 
 #define DRV_NAME		"enic"
 #define DRV_DESCRIPTION		"Cisco VIC Ethernet NIC Driver"
-#define DRV_VERSION		"2.1.1.3"
+#define DRV_VERSION		"2.1.1.4"
 #define DRV_COPYRIGHT		"Copyright 2008-2011 Cisco Systems, Inc"
 
 #define ENIC_BARS_MAX		6
diff --git a/drivers/net/enic/enic_dev.c b/drivers/net/enic/enic_dev.c
index a52dbd2..3826266 100644
--- a/drivers/net/enic/enic_dev.c
+++ b/drivers/net/enic/enic_dev.c
@@ -49,26 +49,28 @@ int enic_dev_stats_dump(struct enic *enic, struct vnic_stats **vstats)
 
 int enic_dev_add_station_addr(struct enic *enic)
 {
-	int err = 0;
+	int err;
+
+	if (!is_valid_ether_addr(enic->netdev->dev_addr))
+		return -EADDRNOTAVAIL;
 
-	if (is_valid_ether_addr(enic->netdev->dev_addr)) {
-		spin_lock(&enic->devcmd_lock);
-		err = vnic_dev_add_addr(enic->vdev, enic->netdev->dev_addr);
-		spin_unlock(&enic->devcmd_lock);
-	}
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_add_addr(enic->vdev, enic->netdev->dev_addr);
+	spin_unlock(&enic->devcmd_lock);
 
 	return err;
 }
 
 int enic_dev_del_station_addr(struct enic *enic)
 {
-	int err = 0;
+	int err;
+
+	if (!is_valid_ether_addr(enic->netdev->dev_addr))
+		return -EADDRNOTAVAIL;
 
-	if (is_valid_ether_addr(enic->netdev->dev_addr)) {
-		spin_lock(&enic->devcmd_lock);
-		err = vnic_dev_del_addr(enic->vdev, enic->netdev->dev_addr);
-		spin_unlock(&enic->devcmd_lock);
-	}
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_del_addr(enic->vdev, enic->netdev->dev_addr);
+	spin_unlock(&enic->devcmd_lock);
 
 	return err;
 }
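
A userspace sketch of the early-return shape this patch adopts, under
the assumption that a valid station address is one that is neither
multicast nor all zeros (which is what the kernel's
is_valid_ether_addr() checks). All demo_* names are hypothetical, and
the firmware call is elided.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_ETH_ALEN 6

/* Multicast and broadcast addresses have the low bit of octet 0 set. */
int demo_is_multicast(const uint8_t *addr)
{
	return addr[0] & 0x01;
}

/* True when all six octets are zero. */
int demo_is_zero(const uint8_t *addr)
{
	int i;

	for (i = 0; i < DEMO_ETH_ALEN; i++)
		if (addr[i])
			return 0;
	return 1;
}

/*
 * Reject an unusable address up front with an error code, rather than
 * skipping the work and reporting success.
 */
int demo_add_station_addr(const uint8_t *addr)
{
	if (demo_is_multicast(addr) || demo_is_zero(addr))
		return -EADDRNOTAVAIL;

	/* A real driver would take its lock and issue the command here. */
	return 0;
}
```

Before this patch the wrappers returned 0 for an invalid address, so a
caller could not tell "added" apart from "silently skipped"; the error
code makes that difference visible.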



* [net-next-2.6 PATCH 1/5] enic: Clean up: Organize devcmd wrapper routines
From: Vasanthy Kolluri @ 2011-02-05  2:17 UTC
  To: davem; +Cc: netdev
In-Reply-To: <20110205021618.8510.34816.stgit@savbu-pc100.cisco.com>

From: Vasanthy Kolluri <vkolluri@cisco.com>

Organize the wrapper routines for firmware devcmds into a separate file.

Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
---
 drivers/net/enic/Makefile    |    2 
 drivers/net/enic/enic.h      |    2 
 drivers/net/enic/enic_dev.c  |  230 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/enic/enic_dev.h  |   42 ++++++++
 drivers/net/enic/enic_main.c |  207 --------------------------------------
 5 files changed, 275 insertions(+), 208 deletions(-)
 create mode 100644 drivers/net/enic/enic_dev.c
 create mode 100644 drivers/net/enic/enic_dev.h


diff --git a/drivers/net/enic/Makefile b/drivers/net/enic/Makefile
index e7b6c31..2e573be 100644
--- a/drivers/net/enic/Makefile
+++ b/drivers/net/enic/Makefile
@@ -1,5 +1,5 @@
 obj-$(CONFIG_ENIC) := enic.o
 
 enic-y := enic_main.o vnic_cq.o vnic_intr.o vnic_wq.o \
-	enic_res.o vnic_dev.o vnic_rq.o vnic_vic.o
+	enic_res.o enic_dev.o vnic_dev.o vnic_rq.o vnic_vic.o
 
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 44865bb..1385a60 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -32,7 +32,7 @@
 
 #define DRV_NAME		"enic"
 #define DRV_DESCRIPTION		"Cisco VIC Ethernet NIC Driver"
-#define DRV_VERSION		"2.1.1.2a"
+#define DRV_VERSION		"2.1.1.3"
 #define DRV_COPYRIGHT		"Copyright 2008-2011 Cisco Systems, Inc"
 
 #define ENIC_BARS_MAX		6
diff --git a/drivers/net/enic/enic_dev.c b/drivers/net/enic/enic_dev.c
new file mode 100644
index 0000000..a52dbd2
--- /dev/null
+++ b/drivers/net/enic/enic_dev.c
@@ -0,0 +1,230 @@
+/*
+ * Copyright 2011 Cisco Systems, Inc.  All rights reserved.
+ *
+ * This program is free software; you may redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#include <linux/pci.h>
+#include <linux/etherdevice.h>
+
+#include "vnic_dev.h"
+#include "vnic_vic.h"
+#include "enic_res.h"
+#include "enic.h"
+#include "enic_dev.h"
+
+int enic_dev_fw_info(struct enic *enic, struct vnic_devcmd_fw_info **fw_info)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_fw_info(enic->vdev, fw_info);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_stats_dump(struct enic *enic, struct vnic_stats **vstats)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_stats_dump(enic->vdev, vstats);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_add_station_addr(struct enic *enic)
+{
+	int err = 0;
+
+	if (is_valid_ether_addr(enic->netdev->dev_addr)) {
+		spin_lock(&enic->devcmd_lock);
+		err = vnic_dev_add_addr(enic->vdev, enic->netdev->dev_addr);
+		spin_unlock(&enic->devcmd_lock);
+	}
+
+	return err;
+}
+
+int enic_dev_del_station_addr(struct enic *enic)
+{
+	int err = 0;
+
+	if (is_valid_ether_addr(enic->netdev->dev_addr)) {
+		spin_lock(&enic->devcmd_lock);
+		err = vnic_dev_del_addr(enic->vdev, enic->netdev->dev_addr);
+		spin_unlock(&enic->devcmd_lock);
+	}
+
+	return err;
+}
+
+int enic_dev_packet_filter(struct enic *enic, int directed, int multicast,
+	int broadcast, int promisc, int allmulti)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_packet_filter(enic->vdev, directed,
+		multicast, broadcast, promisc, allmulti);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_add_addr(struct enic *enic, u8 *addr)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_add_addr(enic->vdev, addr);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_del_addr(struct enic *enic, u8 *addr)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_del_addr(enic->vdev, addr);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_hw_version(struct enic *enic, enum vnic_dev_hw_version *hw_ver)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_hw_version(enic->vdev, hw_ver);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_notify_unset(struct enic *enic)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_notify_unset(enic->vdev);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_hang_notify(struct enic *enic)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_hang_notify(enic->vdev);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_set_ig_vlan_rewrite_mode(struct enic *enic)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_set_ig_vlan_rewrite_mode(enic->vdev,
+		IG_VLAN_REWRITE_MODE_PRIORITY_TAG_DEFAULT_VLAN);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_enable(struct enic *enic)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_enable_wait(enic->vdev);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_disable(struct enic *enic)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_disable(enic->vdev);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_vnic_dev_deinit(struct enic *enic)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_deinit(enic->vdev);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_init_prov(struct enic *enic, struct vic_provinfo *vp)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_init_prov(enic->vdev,
+		(u8 *)vp, vic_provinfo_size(vp));
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+int enic_dev_init_done(struct enic *enic, int *done, int *error)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_init_done(enic->vdev, done, error);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+/* rtnl lock is held */
+void enic_vlan_rx_add_vid(struct net_device *netdev, u16 vid)
+{
+	struct enic *enic = netdev_priv(netdev);
+
+	spin_lock(&enic->devcmd_lock);
+	enic_add_vlan(enic, vid);
+	spin_unlock(&enic->devcmd_lock);
+}
+
+/* rtnl lock is held */
+void enic_vlan_rx_kill_vid(struct net_device *netdev, u16 vid)
+{
+	struct enic *enic = netdev_priv(netdev);
+
+	spin_lock(&enic->devcmd_lock);
+	enic_del_vlan(enic, vid);
+	spin_unlock(&enic->devcmd_lock);
+}
diff --git a/drivers/net/enic/enic_dev.h b/drivers/net/enic/enic_dev.h
new file mode 100644
index 0000000..3ac6ba1
--- /dev/null
+++ b/drivers/net/enic/enic_dev.h
@@ -0,0 +1,42 @@
+/*
+ * Copyright 2011 Cisco Systems, Inc.  All rights reserved.
+ *
+ * This program is free software; you may redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef _ENIC_DEV_H_
+#define _ENIC_DEV_H_
+
+int enic_dev_fw_info(struct enic *enic, struct vnic_devcmd_fw_info **fw_info);
+int enic_dev_stats_dump(struct enic *enic, struct vnic_stats **vstats);
+int enic_dev_add_station_addr(struct enic *enic);
+int enic_dev_del_station_addr(struct enic *enic);
+int enic_dev_packet_filter(struct enic *enic, int directed, int multicast,
+	int broadcast, int promisc, int allmulti);
+int enic_dev_add_addr(struct enic *enic, u8 *addr);
+int enic_dev_del_addr(struct enic *enic, u8 *addr);
+void enic_vlan_rx_add_vid(struct net_device *netdev, u16 vid);
+void enic_vlan_rx_kill_vid(struct net_device *netdev, u16 vid);
+int enic_dev_hw_version(struct enic *enic, enum vnic_dev_hw_version *hw_ver);
+int enic_dev_notify_unset(struct enic *enic);
+int enic_dev_hang_notify(struct enic *enic);
+int enic_dev_set_ig_vlan_rewrite_mode(struct enic *enic);
+int enic_dev_enable(struct enic *enic);
+int enic_dev_disable(struct enic *enic);
+int enic_vnic_dev_deinit(struct enic *enic);
+int enic_dev_init_prov(struct enic *enic, struct vic_provinfo *vp);
+int enic_dev_init_done(struct enic *enic, int *done, int *error);
+
+#endif /* _ENIC_DEV_H_ */
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 37f907b..3893370 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -44,6 +44,7 @@
 #include "vnic_vic.h"
 #include "enic_res.h"
 #include "enic.h"
+#include "enic_dev.h"
 
 #define ENIC_NOTIFY_TIMER_PERIOD	(2 * HZ)
 #define WQ_ENET_MAX_DESC_LEN		(1 << WQ_ENET_LEN_BITS)
@@ -190,18 +191,6 @@ static int enic_get_settings(struct net_device *netdev,
 	return 0;
 }
 
-static int enic_dev_fw_info(struct enic *enic,
-	struct vnic_devcmd_fw_info **fw_info)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_fw_info(enic->vdev, fw_info);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
 static void enic_get_drvinfo(struct net_device *netdev,
 	struct ethtool_drvinfo *drvinfo)
 {
@@ -246,17 +235,6 @@ static int enic_get_sset_count(struct net_device *netdev, int sset)
 	}
 }
 
-static int enic_dev_stats_dump(struct enic *enic, struct vnic_stats **vstats)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_stats_dump(enic->vdev, vstats);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
 static void enic_get_ethtool_stats(struct net_device *netdev,
 	struct ethtool_stats *stats, u64 *data)
 {
@@ -919,32 +897,6 @@ static int enic_set_mac_addr(struct net_device *netdev, char *addr)
 	return 0;
 }
 
-static int enic_dev_add_station_addr(struct enic *enic)
-{
-	int err = 0;
-
-	if (is_valid_ether_addr(enic->netdev->dev_addr)) {
-		spin_lock(&enic->devcmd_lock);
-		err = vnic_dev_add_addr(enic->vdev, enic->netdev->dev_addr);
-		spin_unlock(&enic->devcmd_lock);
-	}
-
-	return err;
-}
-
-static int enic_dev_del_station_addr(struct enic *enic)
-{
-	int err = 0;
-
-	if (is_valid_ether_addr(enic->netdev->dev_addr)) {
-		spin_lock(&enic->devcmd_lock);
-		err = vnic_dev_del_addr(enic->vdev, enic->netdev->dev_addr);
-		spin_unlock(&enic->devcmd_lock);
-	}
-
-	return err;
-}
-
 static int enic_set_mac_address_dynamic(struct net_device *netdev, void *p)
 {
 	struct enic *enic = netdev_priv(netdev);
@@ -989,41 +941,6 @@ static int enic_set_mac_address(struct net_device *netdev, void *p)
 	return enic_dev_add_station_addr(enic);
 }
 
-static int enic_dev_packet_filter(struct enic *enic, int directed,
-	int multicast, int broadcast, int promisc, int allmulti)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_packet_filter(enic->vdev, directed,
-		multicast, broadcast, promisc, allmulti);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
-static int enic_dev_add_addr(struct enic *enic, u8 *addr)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_add_addr(enic->vdev, addr);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
-static int enic_dev_del_addr(struct enic *enic, u8 *addr)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_del_addr(enic->vdev, addr);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
 static void enic_add_multicast_addr_list(struct enic *enic)
 {
 	struct net_device *netdev = enic->netdev;
@@ -1170,26 +1087,6 @@ static void enic_vlan_rx_register(struct net_device *netdev,
 	enic->vlan_group = vlan_group;
 }
 
-/* rtnl lock is held */
-static void enic_vlan_rx_add_vid(struct net_device *netdev, u16 vid)
-{
-	struct enic *enic = netdev_priv(netdev);
-
-	spin_lock(&enic->devcmd_lock);
-	enic_add_vlan(enic, vid);
-	spin_unlock(&enic->devcmd_lock);
-}
-
-/* rtnl lock is held */
-static void enic_vlan_rx_kill_vid(struct net_device *netdev, u16 vid)
-{
-	struct enic *enic = netdev_priv(netdev);
-
-	spin_lock(&enic->devcmd_lock);
-	enic_del_vlan(enic, vid);
-	spin_unlock(&enic->devcmd_lock);
-}
-
 /* netif_tx_lock held, BHs disabled */
 static void enic_tx_timeout(struct net_device *netdev)
 {
@@ -1197,40 +1094,6 @@ static void enic_tx_timeout(struct net_device *netdev)
 	schedule_work(&enic->reset);
 }
 
-static int enic_vnic_dev_deinit(struct enic *enic)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_deinit(enic->vdev);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
-static int enic_dev_init_prov(struct enic *enic, struct vic_provinfo *vp)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_init_prov(enic->vdev,
-		(u8 *)vp, vic_provinfo_size(vp));
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
-static int enic_dev_init_done(struct enic *enic, int *done, int *error)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_init_done(enic->vdev, done, error);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
 static int enic_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 {
 	struct enic *enic = netdev_priv(netdev);
@@ -1505,18 +1368,6 @@ static int enic_rq_alloc_buf_a1(struct vnic_rq *rq)
 	return 0;
 }
 
-static int enic_dev_hw_version(struct enic *enic,
-	enum vnic_dev_hw_version *hw_ver)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_hw_version(enic->vdev, hw_ver);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
 static int enic_set_rq_alloc_buf(struct enic *enic)
 {
 	enum vnic_dev_hw_version hw_ver;
@@ -1897,39 +1748,6 @@ static int enic_dev_notify_set(struct enic *enic)
 	return err;
 }
 
-static int enic_dev_notify_unset(struct enic *enic)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_notify_unset(enic->vdev);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
-static int enic_dev_enable(struct enic *enic)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_enable_wait(enic->vdev);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
-static int enic_dev_disable(struct enic *enic)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_disable(enic->vdev);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
 static void enic_notify_timer_start(struct enic *enic)
 {
 	switch (vnic_dev_get_intr_mode(enic->vdev)) {
@@ -2281,29 +2099,6 @@ static int enic_set_rss_nic_cfg(struct enic *enic)
 		rss_hash_bits, rss_base_cpu, rss_enable);
 }
 
-static int enic_dev_hang_notify(struct enic *enic)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_hang_notify(enic->vdev);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
-static int enic_dev_set_ig_vlan_rewrite_mode(struct enic *enic)
-{
-	int err;
-
-	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_set_ig_vlan_rewrite_mode(enic->vdev,
-		IG_VLAN_REWRITE_MODE_PRIORITY_TAG_DEFAULT_VLAN);
-	spin_unlock(&enic->devcmd_lock);
-
-	return err;
-}
-
 static void enic_reset(struct work_struct *work)
 {
 	struct enic *enic = container_of(work, struct enic, reset);



* [net-next-2.6 PATCH 0/5] enic: updates to version 2.1.1.6
From: Vasanthy Kolluri @ 2011-02-05  2:17 UTC (permalink / raw)
  To: davem; +Cc: netdev

The following series implements enic driver updates:

1/5 - Clean up: Organize devcmd wrapper routines
2/5 - Bug Fix: Fix return values of enic_add/del_station_addr routines
3/5 - Bug Fix: Reorder firmware devcmds - CMD_INIT and CMD_IG_VLAN_REWRITE_MODE
4/5 - Clean up: Remove support for an older version of hardware
5/5 - Update MAINTAINERS

Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>


* inetpeer: Move ICMP rate limiting state into inet_peer entries.
From: David Miller @ 2011-02-05  0:24 UTC (permalink / raw)
  To: netdev


Like metrics, the ICMP rate limiting bits are cached state about
a destination.  So move it into the inet_peer entries.

If an inet_peer cannot be bound (the reason is memory allocation
failure or similar), the policy is to allow.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/net/dst.h      |    2 -
 include/net/icmp.h     |    3 --
 include/net/inetpeer.h |    3 ++
 net/ipv4/icmp.c        |   49 ++++++-----------------------------------
 net/ipv4/inetpeer.c    |   43 ++++++++++++++++++++++++++++++++++++
 net/ipv4/route.c       |   56 ++++++++++++++++++++++++++++++++---------------
 net/ipv6/icmp.c        |   16 +++++++------
 net/ipv6/ip6_output.c  |    5 +++-
 net/ipv6/ndisc.c       |    4 ++-
 9 files changed, 108 insertions(+), 73 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index 484f80b..e550195 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -78,8 +78,6 @@ struct dst_entry {
 	atomic_t		__refcnt;	/* client references	*/
 	int			__use;
 	unsigned long		lastuse;
-	unsigned long		rate_last;	/* rate limiting for ICMP */
-	unsigned int		rate_tokens;
 	int			flags;
 #define DST_HOST		0x0001
 #define DST_NOXFRM		0x0002
diff --git a/include/net/icmp.h b/include/net/icmp.h
index 6e991e0..f0698b9 100644
--- a/include/net/icmp.h
+++ b/include/net/icmp.h
@@ -45,7 +45,4 @@ extern int	icmp_ioctl(struct sock *sk, int cmd, unsigned long arg);
 extern int	icmp_init(void);
 extern void	icmp_out_count(struct net *net, unsigned char type);
 
-/* Move into dst.h ? */
-extern int 	xrlim_allow(struct dst_entry *dst, int timeout);
-
 #endif	/* _ICMP_H */
diff --git a/include/net/inetpeer.h b/include/net/inetpeer.h
index 61f2c66..ead2cb2 100644
--- a/include/net/inetpeer.h
+++ b/include/net/inetpeer.h
@@ -44,6 +44,8 @@ struct inet_peer {
 			__u32		tcp_ts;
 			__u32		tcp_ts_stamp;
 			u32		metrics[RTAX_MAX];
+			u32		rate_tokens;	/* rate limiting for ICMP */
+			unsigned long	rate_last;
 		};
 		struct rcu_head         rcu;
 	};
@@ -81,6 +83,7 @@ static inline struct inet_peer *inet_getpeer_v6(struct in6_addr *v6daddr, int cr
 
 /* can be called from BH context or outside */
 extern void inet_putpeer(struct inet_peer *p);
+extern bool inet_peer_xrlim_allow(struct inet_peer *peer, int timeout);
 
 /*
  * temporary check to make sure we dont access rid, ip_id_count, tcp_ts,
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 4aa1b7f..ad2bcf1 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -233,48 +233,11 @@ static inline void icmp_xmit_unlock(struct sock *sk)
  *	Send an ICMP frame.
  */
 
-/*
- *	Check transmit rate limitation for given message.
- *	The rate information is held in the destination cache now.
- *	This function is generic and could be used for other purposes
- *	too. It uses a Token bucket filter as suggested by Alexey Kuznetsov.
- *
- *	Note that the same dst_entry fields are modified by functions in
- *	route.c too, but these work for packet destinations while xrlim_allow
- *	works for icmp destinations. This means the rate limiting information
- *	for one "ip object" is shared - and these ICMPs are twice limited:
- *	by source and by destination.
- *
- *	RFC 1812: 4.3.2.8 SHOULD be able to limit error message rate
- *			  SHOULD allow setting of rate limits
- *
- * 	Shared between ICMPv4 and ICMPv6.
- */
-#define XRLIM_BURST_FACTOR 6
-int xrlim_allow(struct dst_entry *dst, int timeout)
-{
-	unsigned long now, token = dst->rate_tokens;
-	int rc = 0;
-
-	now = jiffies;
-	token += now - dst->rate_last;
-	dst->rate_last = now;
-	if (token > XRLIM_BURST_FACTOR * timeout)
-		token = XRLIM_BURST_FACTOR * timeout;
-	if (token >= timeout) {
-		token -= timeout;
-		rc = 1;
-	}
-	dst->rate_tokens = token;
-	return rc;
-}
-EXPORT_SYMBOL(xrlim_allow);
-
-static inline int icmpv4_xrlim_allow(struct net *net, struct rtable *rt,
+static inline bool icmpv4_xrlim_allow(struct net *net, struct rtable *rt,
 		int type, int code)
 {
 	struct dst_entry *dst = &rt->dst;
-	int rc = 1;
+	bool rc = true;
 
 	if (type > NR_ICMP_TYPES)
 		goto out;
@@ -288,8 +251,12 @@ static inline int icmpv4_xrlim_allow(struct net *net, struct rtable *rt,
 		goto out;
 
 	/* Limit if icmp type is enabled in ratemask. */
-	if ((1 << type) & net->ipv4.sysctl_icmp_ratemask)
-		rc = xrlim_allow(dst, net->ipv4.sysctl_icmp_ratelimit);
+	if ((1 << type) & net->ipv4.sysctl_icmp_ratemask) {
+		if (!rt->peer)
+			rt_bind_peer(rt, 1);
+		rc = inet_peer_xrlim_allow(rt->peer,
+					   net->ipv4.sysctl_icmp_ratelimit);
+	}
 out:
 	return rc;
 }
diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index b6513b1..709fbb4 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -513,6 +513,8 @@ struct inet_peer *inet_getpeer(struct inetpeer_addr *daddr, int create)
 		atomic_set(&p->ip_id_count, secure_ip_id(daddr->a4));
 		p->tcp_ts_stamp = 0;
 		p->metrics[RTAX_LOCK-1] = INETPEER_METRICS_NEW;
+		p->rate_tokens = 0;
+		p->rate_last = 0;
 		INIT_LIST_HEAD(&p->unused);
 
 
@@ -580,3 +582,44 @@ void inet_putpeer(struct inet_peer *p)
 	local_bh_enable();
 }
 EXPORT_SYMBOL_GPL(inet_putpeer);
+
+/*
+ *	Check transmit rate limitation for given message.
+ *	The rate information is held in the inet_peer entries now.
+ *	This function is generic and could be used for other purposes
+ *	too. It uses a Token bucket filter as suggested by Alexey Kuznetsov.
+ *
+ *	Note that the same inet_peer fields are modified by functions in
+ *	route.c too, but these work for packet destinations while
+ *	inet_peer_xrlim_allow works for icmp destinations. This means the
+ *	rate limiting information for one "ip object" is shared - and these
+ *	ICMPs are twice limited: by source and by destination.
+ *
+ *	RFC 1812: 4.3.2.8 SHOULD be able to limit error message rate
+ *			  SHOULD allow setting of rate limits
+ *
+ * 	Shared between ICMPv4 and ICMPv6.
+ */
+#define XRLIM_BURST_FACTOR 6
+bool inet_peer_xrlim_allow(struct inet_peer *peer, int timeout)
+{
+	unsigned long now, token;
+	bool rc = false;
+
+	if (!peer)
+		return true;
+
+	token = peer->rate_tokens;
+	now = jiffies;
+	token += now - peer->rate_last;
+	peer->rate_last = now;
+	if (token > XRLIM_BURST_FACTOR * timeout)
+		token = XRLIM_BURST_FACTOR * timeout;
+	if (token >= timeout) {
+		token -= timeout;
+		rc = true;
+	}
+	peer->rate_tokens = token;
+	return rc;
+}
+EXPORT_SYMBOL(inet_peer_xrlim_allow);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 0ba6a38..2e225da 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1563,6 +1563,7 @@ void ip_rt_send_redirect(struct sk_buff *skb)
 {
 	struct rtable *rt = skb_rtable(skb);
 	struct in_device *in_dev;
+	struct inet_peer *peer;
 	int log_martians;
 
 	rcu_read_lock();
@@ -1574,33 +1575,41 @@ void ip_rt_send_redirect(struct sk_buff *skb)
 	log_martians = IN_DEV_LOG_MARTIANS(in_dev);
 	rcu_read_unlock();
 
+	if (!rt->peer)
+		rt_bind_peer(rt, 1);
+	peer = rt->peer;
+	if (!peer) {
+		icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, rt->rt_gateway);
+		return;
+	}
+
 	/* No redirected packets during ip_rt_redirect_silence;
 	 * reset the algorithm.
 	 */
-	if (time_after(jiffies, rt->dst.rate_last + ip_rt_redirect_silence))
-		rt->dst.rate_tokens = 0;
+	if (time_after(jiffies, peer->rate_last + ip_rt_redirect_silence))
+		peer->rate_tokens = 0;
 
 	/* Too many ignored redirects; do not send anything
 	 * set dst.rate_last to the last seen redirected packet.
 	 */
-	if (rt->dst.rate_tokens >= ip_rt_redirect_number) {
-		rt->dst.rate_last = jiffies;
+	if (peer->rate_tokens >= ip_rt_redirect_number) {
+		peer->rate_last = jiffies;
 		return;
 	}
 
 	/* Check for load limit; set rate_last to the latest sent
 	 * redirect.
 	 */
-	if (rt->dst.rate_tokens == 0 ||
+	if (peer->rate_tokens == 0 ||
 	    time_after(jiffies,
-		       (rt->dst.rate_last +
-			(ip_rt_redirect_load << rt->dst.rate_tokens)))) {
+		       (peer->rate_last +
+			(ip_rt_redirect_load << peer->rate_tokens)))) {
 		icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, rt->rt_gateway);
-		rt->dst.rate_last = jiffies;
-		++rt->dst.rate_tokens;
+		peer->rate_last = jiffies;
+		++peer->rate_tokens;
 #ifdef CONFIG_IP_ROUTE_VERBOSE
 		if (log_martians &&
-		    rt->dst.rate_tokens == ip_rt_redirect_number &&
+		    peer->rate_tokens == ip_rt_redirect_number &&
 		    net_ratelimit())
 			printk(KERN_WARNING "host %pI4/if%d ignores redirects for %pI4 to %pI4.\n",
 				&rt->rt_src, rt->rt_iif,
@@ -1612,7 +1621,9 @@ void ip_rt_send_redirect(struct sk_buff *skb)
 static int ip_error(struct sk_buff *skb)
 {
 	struct rtable *rt = skb_rtable(skb);
+	struct inet_peer *peer;
 	unsigned long now;
+	bool send;
 	int code;
 
 	switch (rt->dst.error) {
@@ -1632,15 +1643,24 @@ static int ip_error(struct sk_buff *skb)
 			break;
 	}
 
-	now = jiffies;
-	rt->dst.rate_tokens += now - rt->dst.rate_last;
-	if (rt->dst.rate_tokens > ip_rt_error_burst)
-		rt->dst.rate_tokens = ip_rt_error_burst;
-	rt->dst.rate_last = now;
-	if (rt->dst.rate_tokens >= ip_rt_error_cost) {
-		rt->dst.rate_tokens -= ip_rt_error_cost;
-		icmp_send(skb, ICMP_DEST_UNREACH, code, 0);
+	if (!rt->peer)
+		rt_bind_peer(rt, 1);
+	peer = rt->peer;
+
+	send = true;
+	if (peer) {
+		now = jiffies;
+		peer->rate_tokens += now - peer->rate_last;
+		if (peer->rate_tokens > ip_rt_error_burst)
+			peer->rate_tokens = ip_rt_error_burst;
+		peer->rate_last = now;
+		if (peer->rate_tokens >= ip_rt_error_cost)
+			peer->rate_tokens -= ip_rt_error_cost;
+		else
+			send = false;
 	}
+	if (send)
+		icmp_send(skb, ICMP_DEST_UNREACH, code, 0);
 
 out:	kfree_skb(skb);
 	return 0;
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 03e62f9..a31d91b 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -157,20 +157,20 @@ static int is_ineligible(struct sk_buff *skb)
 /*
  * Check the ICMP output rate limit
  */
-static inline int icmpv6_xrlim_allow(struct sock *sk, u8 type,
-				     struct flowi *fl)
+static inline bool icmpv6_xrlim_allow(struct sock *sk, u8 type,
+				      struct flowi *fl)
 {
 	struct dst_entry *dst;
 	struct net *net = sock_net(sk);
-	int res = 0;
+	bool res = false;
 
 	/* Informational messages are not limited. */
 	if (type & ICMPV6_INFOMSG_MASK)
-		return 1;
+		return true;
 
 	/* Do not limit pmtu discovery, it would break it. */
 	if (type == ICMPV6_PKT_TOOBIG)
-		return 1;
+		return true;
 
 	/*
 	 * Look up the output route.
@@ -182,7 +182,7 @@ static inline int icmpv6_xrlim_allow(struct sock *sk, u8 type,
 		IP6_INC_STATS(net, ip6_dst_idev(dst),
 			      IPSTATS_MIB_OUTNOROUTES);
 	} else if (dst->dev && (dst->dev->flags&IFF_LOOPBACK)) {
-		res = 1;
+		res = true;
 	} else {
 		struct rt6_info *rt = (struct rt6_info *)dst;
 		int tmo = net->ipv6.sysctl.icmpv6_time;
@@ -191,7 +191,9 @@ static inline int icmpv6_xrlim_allow(struct sock *sk, u8 type,
 		if (rt->rt6i_dst.plen < 128)
 			tmo >>= ((128 - rt->rt6i_dst.plen)>>5);
 
-		res = xrlim_allow(dst, tmo);
+		if (!rt->rt6i_peer)
+			rt6_bind_peer(rt, 1);
+		res = inet_peer_xrlim_allow(rt->rt6i_peer, tmo);
 	}
 	dst_release(dst);
 	return res;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5f8d242..2600e22 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -479,10 +479,13 @@ int ip6_forward(struct sk_buff *skb)
 		else
 			target = &hdr->daddr;
 
+		if (!rt->rt6i_peer)
+			rt6_bind_peer(rt, 1);
+
 		/* Limit redirects both by destination (here)
 		   and by source (inside ndisc_send_redirect)
 		 */
-		if (xrlim_allow(dst, 1*HZ))
+		if (inet_peer_xrlim_allow(rt->rt6i_peer, 1*HZ))
 			ndisc_send_redirect(skb, n, target);
 	} else {
 		int addrtype = ipv6_addr_type(&hdr->saddr);
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 2342545..7254ce3 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1553,7 +1553,9 @@ void ndisc_send_redirect(struct sk_buff *skb, struct neighbour *neigh,
 			   "ICMPv6 Redirect: destination is not a neighbour.\n");
 		goto release;
 	}
-	if (!xrlim_allow(dst, 1*HZ))
+	if (!rt->rt6i_peer)
+		rt6_bind_peer(rt, 1);
+	if (!inet_peer_xrlim_allow(rt->rt6i_peer, 1*HZ))
 		goto release;
 
 	if (dev->addr_len) {
-- 
1.7.4
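
For readers who want to experiment with the limiter outside the kernel,
here is a minimal userspace sketch of the token bucket used by
inet_peer_xrlim_allow() in the patch above. It is an illustration under
assumptions, not the kernel code: struct peer_state and
peer_xrlim_allow() are hypothetical names, the "now" parameter stands in
for jiffies, and rate_tokens is widened to unsigned long for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>

#define BURST_FACTOR 6	/* same value as XRLIM_BURST_FACTOR in the patch */

/* Hypothetical stand-in for the two fields added to struct inet_peer. */
struct peer_state {
	unsigned long rate_tokens;	/* accumulated credit, in ticks */
	unsigned long rate_last;	/* tick of the previous check */
};

/*
 * Token bucket check.  Credit grows by one per elapsed tick, is capped
 * at BURST_FACTOR * timeout, and each allowed message costs 'timeout'.
 * 'now' replaces the kernel's jiffies (assumption for userspace use).
 */
static bool peer_xrlim_allow(struct peer_state *peer, unsigned long now,
			     unsigned long timeout)
{
	unsigned long token;
	bool allowed = false;

	if (!peer)
		return true;	/* no state could be bound: policy is to allow */

	token = peer->rate_tokens + (now - peer->rate_last);
	peer->rate_last = now;
	if (token > BURST_FACTOR * timeout)
		token = BURST_FACTOR * timeout;
	if (token >= timeout) {
		token -= timeout;
		allowed = true;
	}
	peer->rate_tokens = token;
	return allowed;
}
```

With a timeout of 100 ticks, a peer that has been idle for a long time
can receive up to six messages back to back (the burst cap), after which
one message is allowed per 100 elapsed ticks.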



* [net-next-2.6 PATCH] enic: Decouple mac address registration and deregistration from port profile set operation
From: Roopa Prabhu @ 2011-02-04 22:57 UTC (permalink / raw)
  To: davem; +Cc: netdev

From: Roopa Prabhu <roprabhu@cisco.com>

This patch removes VM mac address registration and deregistration code during
port profile set operation. We can delay mac address registration until
enic_open.

Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
---
 drivers/net/enic/enic.h      |    2 +-
 drivers/net/enic/enic_main.c |    6 ------
 2 files changed, 1 insertions(+), 7 deletions(-)


diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index ca3be4f..44865bb 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -32,7 +32,7 @@
 
 #define DRV_NAME		"enic"
 #define DRV_DESCRIPTION		"Cisco VIC Ethernet NIC Driver"
-#define DRV_VERSION		"2.1.1.2"
+#define DRV_VERSION		"2.1.1.2a"
 #define DRV_COPYRIGHT		"Copyright 2008-2011 Cisco Systems, Inc"
 
 #define ENIC_BARS_MAX		6
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 89664c6..37f907b 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -1381,9 +1381,6 @@ static int enic_set_vf_port(struct net_device *netdev, int vf,
 
 		if (is_zero_ether_addr(netdev->dev_addr))
 			random_ether_addr(netdev->dev_addr);
-	} else if (new_pp.request == PORT_REQUEST_DISASSOCIATE) {
-		if (!is_zero_ether_addr(enic->pp.mac_addr))
-			enic_dev_del_addr(enic, enic->pp.mac_addr);
 	}
 
 	memcpy(&enic->pp, &new_pp, sizeof(struct enic_port_profile));
@@ -1392,9 +1389,6 @@ static int enic_set_vf_port(struct net_device *netdev, int vf,
 	if (err)
 		goto set_port_profile_cleanup;
 
-	if (!is_zero_ether_addr(enic->pp.mac_addr))
-		enic_dev_add_addr(enic, enic->pp.mac_addr);
-
 set_port_profile_cleanup:
 	memset(enic->pp.vf_mac, 0, ETH_ALEN);
 



* [PATCH] ipv4: Don't miss existing cached metrics in new routes.
From: David Miller @ 2011-02-04 22:40 UTC (permalink / raw)
  To: netdev


Always lookup to see if we have an existing inetpeer entry for
a route.  Let FLOWI_FLAG_PRECOW_METRICS merely influence the
"create" argument to rt_bind_peer().

Also, call rt_bind_peer() unconditionally since it is not
possible for rt->peer to be non-NULL at this point.

Signed-off-by: David S. Miller <davem@davemloft.net>
---

Committed to net-next-2.6

 net/ipv4/route.c |   31 +++++++++++++++++--------------
 1 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index e4c8165..0ba6a38 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1859,25 +1859,28 @@ static unsigned int ipv4_default_mtu(const struct dst_entry *dst)
 
 static void rt_init_metrics(struct rtable *rt, struct fib_info *fi)
 {
-	if (!(rt->fl.flags & FLOWI_FLAG_PRECOW_METRICS)) {
-	no_cow:
-		if (fi->fib_metrics != (u32 *) dst_default_metrics) {
-			rt->fi = fi;
-			atomic_inc(&fi->fib_clntref);
-		}
-		dst_init_metrics(&rt->dst, fi->fib_metrics, true);
-	} else {
-		struct inet_peer *peer;
+	struct inet_peer *peer;
+	int create = 0;
 
-		if (!rt->peer)
-			rt_bind_peer(rt, 1);
-		peer = rt->peer;
-		if (!peer)
-			goto no_cow;
+	/* If a peer entry exists for this destination, we must hook
+	 * it up in order to get at cached metrics.
+	 */
+	if (rt->fl.flags & FLOWI_FLAG_PRECOW_METRICS)
+		create = 1;
+
+	rt_bind_peer(rt, create);
+	peer = rt->peer;
+	if (peer) {
 		if (inet_metrics_new(peer))
 			memcpy(peer->metrics, fi->fib_metrics,
 			       sizeof(u32) * RTAX_MAX);
 		dst_init_metrics(&rt->dst, peer->metrics, false);
+	} else {
+		if (fi->fib_metrics != (u32 *) dst_default_metrics) {
+			rt->fi = fi;
+			atomic_inc(&fi->fib_clntref);
+		}
+		dst_init_metrics(&rt->dst, fi->fib_metrics, true);
 	}
 }
 
-- 
1.7.4
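
The control flow of the new rt_init_metrics() can be modelled in
userspace roughly as follows. This is only a sketch under assumptions:
the one-slot struct peer_cache, bind_peer() and init_metrics() are
hypothetical stand-ins, and the metrics array is reduced to a single
int. The point it shows is that an existing peer entry is always found,
while the PRECOW flag only decides whether a missing one gets created.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified peer entry: one metric instead of an array. */
struct peer {
	int metrics;
	bool metrics_new;	/* true until seeded from the FIB */
};

/* Hypothetical one-slot cache standing in for the inetpeer tree. */
struct peer_cache {
	struct peer slot;
	bool present;
};

/* Look up the peer; create it only when 'create' is non-zero. */
static struct peer *bind_peer(struct peer_cache *cache, int create)
{
	if (!cache->present) {
		if (!create)
			return NULL;
		cache->present = true;
		cache->slot.metrics = 0;
		cache->slot.metrics_new = true;
	}
	return &cache->slot;
}

/* Returns the metric value a newly created route would start with. */
static int init_metrics(struct peer_cache *cache, bool precow,
			int fib_metrics)
{
	struct peer *peer = bind_peer(cache, precow ? 1 : 0);

	if (peer) {
		/* Seed a fresh peer from the FIB, else keep cached values. */
		if (peer->metrics_new) {
			peer->metrics = fib_metrics;
			peer->metrics_new = false;
		}
		return peer->metrics;
	}
	/* No peer entry: fall back to the (read-only) FIB metrics. */
	return fib_metrics;
}
```

In this model a route created without the PRECOW flag still picks up a
previously cached value as soon as a peer entry exists, which is the
case the old code missed.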



* Re: [PATCH 1/6] sysctl: faster reimplementation of sysctl_check_table
From: Lucian Adrian Grijincu @ 2011-02-04 21:34 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, netdev, Eric Dumazet, David S. Miller,
	Octavian Purdila
In-Reply-To: <m1vd0zh5eh.fsf@fess.ebiederm.org>

On Fri, Feb 4, 2011 at 11:11 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>> +static int __sysctl_check_table(struct nsproxy *namespaces,
>> +     struct ctl_table *table, struct ctl_table **parents, int depth)
>>  {
>> +     const char *fail = NULL;
>>       int error = 0;
>> +
>> +     if (depth >= CTL_MAXNAME) {
>
> This should be depth > CTL_MAXNAME.  Because there are only CTL_MAXNAME
> entries in the array.


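A bit lower in the function we access 'parents[depth]', and the array
has only CTL_MAXNAME entries, so depth == CTL_MAXNAME would already be
out of bounds.
So the correct check should be (depth >= CTL_MAXNAME) => error.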


>> -                     sysctl_check_leaf(namespaces, table, &fail);
>> +                     parents[depth] = table;
>> +                     sysctl_check_leaf(namespaces, table, &fail,
>> +                                       parents, depth);
>>               }

>> +             if (table->child) {
>> +                     parents[depth] = table;
>> +                     error |= __sysctl_check_table(namespaces, table->child,
>> +                                                   parents, depth + 1);
>> +             }
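
To make the bounds argument concrete, here is a small standalone sketch
(not the sysctl code itself). struct node, walk() and MAXNAME are
hypothetical names chosen for the illustration; MAXNAME plays the role
of CTL_MAXNAME and is the number of entries in the parents array.

```c
#include <stddef.h>

#define MAXNAME 10	/* illustrative stand-in for CTL_MAXNAME */

/* Hypothetical table node: only the child link matters here. */
struct node {
	struct node *child;
};

/*
 * Records the path in parents[], which has MAXNAME entries, so the
 * valid indexes are 0 .. MAXNAME - 1.  Returns 0 on success and -1
 * when the nesting is too deep to be recorded.
 */
static int walk(struct node *n, struct node **parents, int depth)
{
	if (depth >= MAXNAME)
		return -1;	/* parents[depth] would be out of bounds */

	parents[depth] = n;
	if (n->child)
		return walk(n->child, parents, depth + 1);
	return 0;
}
```

With a check of (depth > MAXNAME) instead, a chain of MAXNAME + 1 nodes
would reach the store with depth == MAXNAME and write one element past
the end of the array.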



-- 
 .
..: Lucian


* [GIT] Networking
From: David Miller @ 2011-02-04 21:18 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


Lots of little fixes in here, as is usually the case at this point.

The qeth fixes and the XEN netfront buggy guest handling are the
largest changes, with the ipv4/ipv6 multicast compat ioctl handling
bits next in terms of size.

1) Missing ref grab in netfilter conntrack netlink, from Pablo Neira Ayuso.
2) IPV6 boundary checks busted in xt_iprange module, from Thomas Jacob.
3) Wrong indexes used in loop in batman-adv module, leading to crash, fix
   from Linus Lüssing.
4) rtlwifi refactoring caused regressions in firmware upload, fix from
   Chaoming Li.
5) Fix error handling in ath5k_hw_dma_stop, from Bob Copeland.
6) Frame endianness fixes in ath5k, also from Bob Copeland.
7) Fix crashes due to ath9k tasklet shutdown races, from Stanislaw Gruszka.
8) Fix build warnings in econet, from Eric Dumazet.
9) Handle buggy guests setting of NETRX_csum_* in XEN netfront driver, from
   Ian Campbell.
10) Fix DMA-API debugging message spew in dl2k driver due to double
    unmaps.  Fix from Stanislaw Gruszka.
11) In rtnetlink, validate_linkmsg() uses wrong afinfo pointer.  Fix
    from Kurt Van Dijck.
12) __alloc_skb() needs kmemcheck annotations to avoid false positives,
    from Eric Dumazet.
13) Remove some complications wrt. ipv6 inetpeer binding.  All reports
    about triggered warnings et al. should be gone now.
14) napi_reuse_skb() must reset skb->dev and skb->iif, from Herbert Xu
    and Andy Gospodarek.
15) CAIF protocol headers need userspace export.  From Sjur Braendeland.
16) Fix oops on adding network namespace, fix from Eric W. Biederman.
17) ipv4 and ipv6 multicast ioctls need compat handling.  Based upon
    some initial work by Eric W. Biederman and some audits by Arnd
    Bergmann.
18) batman-adv module unhashes wrong object on failures, fix from
    Sven Eckelmann.
19) batman-adv forgets to free memory in free_info(), also from Sven.
20) batman-adv list traversal in send_vis_packets() drops lock inside
    its loop, so we have to iterate over the head to be thread safe.
    Also from Sven Eckelmann.
21) bnx2x driver fixes from Yaniv Rosner and Eilon Greenstein.
22) wl12xx wireless has use-after-free error, fix from Mathias Krause.
23) Fix OOPS regression with IPSEC, ipv4/ipv6 blackhole dst ops need
    to implement a "default_mtu" method.  Fix from Roland Dreier.
24) CAN softing driver needs to depend upon IOMEM.
25) get_rps_cpu() erroneously elides flow processing when RPS map has
    length of one.  Fix from Tom Herbert.
26) ipv6 unregisters sysctl tables in wrong order, resulting in
    WARNING in unregister_sysctl_table().  Fix from Eric W. Biederman.
27) enc28j60 driver uses "sizeof(pointer)" when it means "sizeof(array)",
    fix from Stefan Weil.
28) vhost spews bogus RCU warnings, fix from Michael S. Tsirkin.
29) arpt_mangle.c's checkentry() was changed to return int, but it
    still erroneously returns "false" and "true".  Fix from Pablo
    Neira Ayuso.
30) Fix netfilter conntrack event filtering, also from Pablo.
31) SKB leak in ath_paprd_send_frame(), fix from Mohammed Shafi Shajakhan.
32) Mixed up boolean operators in vxge driver result in condition
    always true, fix from Stefan Weil.
33) Remove use of undefined operation in depca driver, from Alan Cox.
34) ISDN icn driver strncpy() usage fix, from Stefan Weil.
35) be2net driver erroneously mucks with tx queue status during link
    up/down, resulting in crashes.  Fix from Ajit Khaparde.
36) be2net illegally calls netif_stop_queue() before register_netdevice()
    happens, also from Ajit.
37) s390 qeth fixes from Ursula Braun, Frank Blaschka, and Stefan Weil.
38) Fix races between interface up/down and get_stats in NIU driver.
39) Like nlmsg_cancel(), genlmsg_cancel() must explicitly handle NULL
    second argument (before we subtract from it and it no longer
    looks like "NULL").  Analysis and fix from Julia Lawall.
40) R8169 bug fixes via Francois Romieu:
    a) 8168c needs RxFIFO overflow workaround too, from Ivan Vecera.
    b) Refine RxFIFO overflow logic to reset less, from Francois.
    c) Interrupt handler needs to be more careful in RxFIFO overflow
       cases, also from Francois.
41) Bridging code inserts FDB entries to the hash table before they
    are fully initialized.  Since FDB lookup uses RCU this is a serious
    issue, from Pavel Emelyanov.
42) Some CAN drivers create world-writable sysfs files, oops.  Fix from
    Vasiliy Kulikov.
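
As a generic illustration of the bug class in item 27 (this is not the
enc28j60 code), the snippet below shows why sizeof gives different
answers for a pointer and for the array it points into. TSV_SIZE and
both helper names are made up for the example.

```c
#include <stddef.h>

#define TSV_SIZE 7	/* illustrative buffer length, not taken from the driver */

/* 'tsv' is a pointer here, so sizeof yields the size of the pointer. */
static size_t size_via_pointer(const unsigned char *tsv)
{
	return sizeof(tsv);
}

/* 'tsv' is a real array here, so sizeof yields the array length in bytes. */
static size_t size_via_array(void)
{
	unsigned char tsv[TSV_SIZE];

	return sizeof(tsv);
}
```

On a typical 64-bit build the first helper returns 8 regardless of how
large the underlying buffer is, while the second returns 7.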

Please pull, thanks a lot!

The following changes since commit 831d52bc153971b70e64eccfbed2b232394f22f8:

  x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm (2011-02-03 13:32:39 -0800)

are available in the git repository at:
  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Ajit Khaparde (3):
      be2net: fix a crash seen during insmod/rmmod test
      be2net: remove netif_stop_queue being called before register_netdev.
      MAINTAINERS: update email ids of the be2net driver maintainers.

Alan Cox (1):
      depca: Fix warnings

Andy Gospodarek (1):
      gro: reset skb_iif on reuse

Bob Copeland (2):
      ath5k: fix error handling in ath5k_hw_dma_stop
      ath5k: correct endianness of frame duration

Chaoming Li (1):
      rtlwifi: Fix firmware upload errors

Chuck Ebbert (2):
      CAN: softing driver depends on IOMEM
      atl1c: Add missing PCI device ID

David S. Miller (9):
      ipv6: Remove route peer binding assertions.
      Merge branch 'batman-adv/merge-oopsonly' of git://git.open-mesh.org/ecsv/linux-merge
      Merge branch 'vhost-net' of git://git.kernel.org/.../mst/vhost
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6
      Merge branch 'master' of git://git.kernel.org/.../kaber/nf-2.6
      niu: Fix races between up/down and get_stats.
      net: Fix bug in compat SIOCGETSGCNT handling.
      net: Support compat SIOCGETVIFCNT ioctl in ipv4.
      net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6.

Eric Dumazet (2):
      econet: remove compiler warnings
      net: add kmemcheck annotation in __alloc_skb()

Eric W. Biederman (3):
      net: Fix ip link add netns oops
      net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT
      net: Fix ipv6 neighbour unregister_sysctl_table warning

Francois Romieu (2):
      r8169: RxFIFO overflow oddities with 8168 chipsets.
      r8169: prevent RxFIFO induced loops in the irq handler.

Frank Blaschka (1):
      qeth: add more strict MTU checking

Herbert Xu (1):
      gro: Reset dev pointer on reuse

Ian Campbell (1):
      xen: netfront: handle incoming GSO SKBs which are not CHECKSUM_PARTIAL

Ivan Vecera (1):
      r8169: use RxFIFO overflow workaround for 8168c chipset.

Julia Lawall (1):
      include/net/genetlink.h: Allow genlmsg_cancel to accept a NULL argument

Ken Kawasaki (1):
      axnet_cs: reduce delay time at ei_rx_overrun

Kurt Van Dijck (1):
      net: fix validate_link_af in rtnetlink core

Linus Lüssing (1):
      batman-adv: Fix kernel panic when fetching vis data on a vis server

Luciano Coelho (1):
      MAINTAINERS: update information for the wl12xx driver

Mathias Krause (1):
      wl12xx: fix use after free

Michael S. Tsirkin (1):
      vhost: rcu annotation fixup

Mohammed Shafi Shajakhan (1):
      ath9k: Fix memory leak due to failed PAPRD frames

Oliver Hartkopp (1):
      slcan: fix referenced website in Kconfig help text

Pablo Neira Ayuso (3):
      netfilter: ctnetlink: fix missing refcount increment during dumps
      netfilter: arpt_mangle: fix return values of checkentry
      netfilter: ecache: always set events bits, filter them later

Pavel Emelyanov (1):
      bridge: Don't put partly initialized fdb into hash

Peter Chubb (1):
      tcp_ecn is an integer not a boolean

Rajkumar Manoharan (2):
      ath9k_hw: Fix system hang when resuming from S3/S4
      ath9k: Fix power save usage count imbalance on deinit

Roland Dreier (1):
      net: Add default_mtu() methods to blackhole dst_ops

Stanislaw Gruszka (3):
      ath9k: fix race conditions when stop device
      ath9k_htc: fix race conditions when stop device
      dl2k: nulify fraginfo after unmap

Stefan Weil (5):
      enc28j60: Fix reading of transmit status vector
      vxge: Fix wrong boolean operator
      isdn: icn: Fix potentially wrong string handling
      s390: Fix wrong size in memcmp (netiucv)
      s390: Fix possibly wrong size in strncmp (smsgiucv)

Sven Eckelmann (3):
      batman-adv: Remove vis info on hashing errors
      batman-adv: Remove vis info element in free_info
      batman-adv: Make vis info stack traversal threadsafe

Thomas Jacob (1):
      netfilter: xt_iprange: Incorrect xt_iprange boundary check for IPv6

Tom Herbert (1):
      net: Check rps_flow_table when RPS map length is 1

Ursula Braun (3):
      qeth: show new mac-address if its setting fails
      qeth: allow HiperSockets framesize change in suspend
      qeth: allow OSA CHPARM change in suspend state

Vasiliy Kulikov (2):
      net: can: at91_can: world-writable sysfs files
      net: can: janz-ican3: world-writable sysfs termination file

Vladislav Zolotarov (1):
      bnx2x: multicasts in NPAR mode

Yaniv Rosner (5):
      bnx2x: Remove setting XAUI low-power for BCM8073
      bnx2x: Fix LED blink rate on BCM84823
      bnx2x: Fix port swap for BCM8073
      bnx2x: Fix potential link loss in multi-function mode
      bnx2x: Update bnx2x version to 1.62.00-5

sjur.brandeland@stericsson.com (1):
      caif: bugfix - add caif headers for userspace usage.

 Documentation/networking/ip-sysctl.txt        |    2 +-
 MAINTAINERS                                   |   15 +--
 drivers/isdn/icn/icn.c                        |    3 +-
 drivers/net/atl1c/atl1c_main.c                |    1 +
 drivers/net/benet/be_main.c                   |    4 -
 drivers/net/bnx2x/bnx2x.h                     |    4 +-
 drivers/net/bnx2x/bnx2x_link.c                |   65 +++--------
 drivers/net/bnx2x/bnx2x_main.c                |   27 ++---
 drivers/net/can/Kconfig                       |    2 +-
 drivers/net/can/at91_can.c                    |    2 +-
 drivers/net/can/janz-ican3.c                  |    2 +-
 drivers/net/can/softing/Kconfig               |    2 +-
 drivers/net/depca.c                           |    6 +-
 drivers/net/dl2k.c                            |    4 +-
 drivers/net/enc28j60.c                        |    2 +-
 drivers/net/niu.c                             |   61 ++++++++---
 drivers/net/pcmcia/axnet_cs.c                 |    6 +-
 drivers/net/r8169.c                           |   41 ++++++--
 drivers/net/vxge/vxge-config.c                |    2 +-
 drivers/net/wireless/ath/ath5k/dma.c          |    4 +-
 drivers/net/wireless/ath/ath5k/pcu.c          |    4 +-
 drivers/net/wireless/ath/ath9k/ar9002_hw.c    |    3 +-
 drivers/net/wireless/ath/ath9k/htc_drv_init.c |    3 -
 drivers/net/wireless/ath/ath9k/htc_drv_main.c |   21 +++-
 drivers/net/wireless/ath/ath9k/init.c         |    7 +-
 drivers/net/wireless/ath/ath9k/main.c         |   19 +++-
 drivers/net/wireless/rtlwifi/efuse.c          |   40 ++++----
 drivers/net/wireless/wl12xx/spi.c             |    3 +-
 drivers/net/xen-netfront.c                    |   96 +++++++++++++++--
 drivers/s390/net/netiucv.c                    |    2 +-
 drivers/s390/net/qeth_core_main.c             |  149 +++++++++++++------------
 drivers/s390/net/qeth_l2_main.c               |    4 +-
 drivers/s390/net/smsgiucv.c                   |    2 +-
 drivers/vhost/net.c                           |    9 +-
 drivers/vhost/vhost.h                         |    6 +-
 include/linux/Kbuild                          |    1 +
 include/linux/caif/Kbuild                     |    2 +
 include/linux/mroute.h                        |    1 +
 include/linux/mroute6.h                       |    1 +
 include/net/genetlink.h                       |    3 +-
 include/net/netfilter/nf_conntrack_ecache.h   |    3 -
 include/net/sock.h                            |    2 +
 net/batman-adv/vis.c                          |   14 ++-
 net/bridge/br_fdb.c                           |    4 +-
 net/core/dev.c                                |    5 +-
 net/core/rtnetlink.c                          |    6 +-
 net/core/skbuff.c                             |    1 +
 net/econet/af_econet.c                        |    4 +-
 net/ipv4/af_inet.c                            |   16 +++
 net/ipv4/ipmr.c                               |   76 +++++++++++++
 net/ipv4/netfilter/arpt_mangle.c              |    6 +-
 net/ipv4/raw.c                                |   19 +++
 net/ipv4/route.c                              |    6 +
 net/ipv6/ip6mr.c                              |   75 +++++++++++++
 net/ipv6/raw.c                                |   19 +++
 net/ipv6/route.c                              |   10 +-
 net/ipv6/sysctl_net_ipv6.c                    |    9 ++-
 net/netfilter/nf_conntrack_ecache.c           |    3 +
 net/netfilter/nf_conntrack_netlink.c          |    1 +
 net/netfilter/xt_iprange.c                    |   16 +--
 60 files changed, 637 insertions(+), 289 deletions(-)
 create mode 100644 include/linux/caif/Kbuild

^ permalink raw reply

* Re: [PATCH 1/6] sysctl: faster reimplementation of sysctl_check_table
From: Eric W. Biederman @ 2011-02-04 21:11 UTC (permalink / raw)
  To: Lucian Adrian Grijincu
  Cc: linux-kernel, netdev, Eric Dumazet, David S. Miller,
	Octavian Purdila
In-Reply-To: <1296851485-6730-1-git-send-email-lucian.grijincu@gmail.com>

Lucian Adrian Grijincu <lucian.grijincu@gmail.com> writes:

> Determining the parent of a node at depth d
> - previous implementation: O(d)
> - current  implementation: O(1)
>
> Printing the path to a node at depth d
> - previous implementation: O(d^2)
> - current  implementation: O(d)
>
> This comes at a cost: we use an array ('parents') holding as many
> pointers as there can be sysctl levels (currently CTL_MAXNAME=10).
>
> The 'parents' array of pointers holds the same values as the
> ctl_table->parent field because the function that updates ->parent
> (sysctl_set_parent) is called with either NULL (for root nodes) or
> with sysctl_set_parent(table, table->child).

>
> Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
> ---
>  kernel/sysctl_check.c |  130 +++++++++++++++++++++++++-----------------------
>  1 files changed, 68 insertions(+), 62 deletions(-)
>
> diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
> index 10b90d8..d23b085 100644
> --- a/kernel/sysctl_check.c
> +++ b/kernel/sysctl_check.c
> @@ -6,58 +6,34 @@
>  #include <net/ip_vs.h>
>  
>  
> -static int sysctl_depth(struct ctl_table *table)
> -{
> -	struct ctl_table *tmp;
> -	int depth;
> -
> -	depth = 0;
> -	for (tmp = table; tmp->parent; tmp = tmp->parent)
> -		depth++;
> -
> -	return depth;
> -}
> -
> -static struct ctl_table *sysctl_parent(struct ctl_table *table, int n)
> +static void sysctl_print_path(struct ctl_table *table,
> +			      struct ctl_table **parents, int depth)
>  {
> +	struct ctl_table *p;
>  	int i;
> -
> -	for (i = 0; table && i < n; i++)
> -		table = table->parent;
> -
> -	return table;
> -}
> -
> -
> -static void sysctl_print_path(struct ctl_table *table)
> -{
> -	struct ctl_table *tmp;
> -	int depth, i;
> -	depth = sysctl_depth(table);
>  	if (table->procname) {
> -		for (i = depth; i >= 0; i--) {
> -			tmp = sysctl_parent(table, i);
> -			printk("/%s", tmp->procname?tmp->procname:"");
> +		for (i = 0; i < depth; i++) {
> +			p = parents[i];
> +			printk("/%s", p->procname ? p->procname : "");
>  		}
> +		printk("/%s", table->procname);
>  	}
>  	printk(" ");
>  }
>  
>  static struct ctl_table *sysctl_check_lookup(struct nsproxy *namespaces,
> -						struct ctl_table *table)
> +	     struct ctl_table *table, struct ctl_table **parents, int depth)
>  {
>  	struct ctl_table_header *head;
>  	struct ctl_table *ref, *test;
> -	int depth, cur_depth;
> -
> -	depth = sysctl_depth(table);
> +	int cur_depth;
>  
>  	for (head = __sysctl_head_next(namespaces, NULL); head;
>  	     head = __sysctl_head_next(namespaces, head)) {
>  		cur_depth = depth;
>  		ref = head->ctl_table;
>  repeat:
> -		test = sysctl_parent(table, cur_depth);
> +		test = parents[depth - cur_depth];
>  		for (; ref->procname; ref++) {
>  			int match = 0;
>  			if (cur_depth && !ref->child)
> @@ -83,11 +59,12 @@ out:
>  	return ref;
>  }
>  
> -static void set_fail(const char **fail, struct ctl_table *table, const char *str)
> +static void set_fail(const char **fail, struct ctl_table *table,
> +	     const char *str, struct ctl_table **parents, int depth)
>  {
>  	if (*fail) {
>  		printk(KERN_ERR "sysctl table check failed: ");
> -		sysctl_print_path(table);
> +		sysctl_print_path(table, parents, depth);
>  		printk(" %s\n", *fail);
>  		dump_stack();
>  	}
> @@ -95,40 +72,55 @@ static void set_fail(const char **fail, struct ctl_table *table, const char *str
>  }
>  
>  static void sysctl_check_leaf(struct nsproxy *namespaces,
> -				struct ctl_table *table, const char **fail)
> +			      struct ctl_table *table, const char **fail,
> +			      struct ctl_table **parents, int depth)
>  {
>  	struct ctl_table *ref;
>  
> -	ref = sysctl_check_lookup(namespaces, table);
> -	if (ref && (ref != table))
> -		set_fail(fail, table, "Sysctl already exists");
> +	ref = sysctl_check_lookup(namespaces, table, parents, depth);
> +	if (ref && (ref != table)) {
> +		printk(KERN_ALERT "sysctl_check_leaf ref[%s], table[%s]\n", ref->procname, table->procname);
> +		set_fail(fail, table, "Sysctl already exists", parents, depth);
> +	}
>  }
>  
> -int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
> +
> +
> +#define SET_FAIL(str) set_fail(&fail, table, str, parents, depth)
> +
> +static int __sysctl_check_table(struct nsproxy *namespaces,
> +	struct ctl_table *table, struct ctl_table **parents, int depth)
>  {
> +	const char *fail = NULL;
>  	int error = 0;
> +
> +	if (depth >= CTL_MAXNAME) {

This should be depth > CTL_MAXNAME, because there are only CTL_MAXNAME
entries in the array.

Eric


> +		SET_FAIL("Sysctl tree too deep");
> +		return -EINVAL;
> +	}
> +
>  	for (; table->procname; table++) {
> -		const char *fail = NULL;
> +		fail = NULL;
>  
>  		if (table->parent) {
>  			if (table->procname && !table->parent->procname)
> -				set_fail(&fail, table, "Parent without procname");
> +				SET_FAIL("Parent without procname");
>  		}
>  		if (!table->procname)
> -			set_fail(&fail, table, "No procname");
> +			SET_FAIL("No procname");
>  		if (table->child) {
>  			if (table->data)
> -				set_fail(&fail, table, "Directory with data?");
> +				SET_FAIL("Directory with data?");
>  			if (table->maxlen)
> -				set_fail(&fail, table, "Directory with maxlen?");
> +				SET_FAIL("Directory with maxlen?");
>  			if ((table->mode & (S_IRUGO|S_IXUGO)) != table->mode)
> -				set_fail(&fail, table, "Writable sysctl directory");
> +				SET_FAIL("Writable sysctl directory");
>  			if (table->proc_handler)
> -				set_fail(&fail, table, "Directory with proc_handler");
> +				SET_FAIL("Directory with proc_handler");
>  			if (table->extra1)
> -				set_fail(&fail, table, "Directory with extra1");
> +				SET_FAIL("Directory with extra1");
>  			if (table->extra2)
> -				set_fail(&fail, table, "Directory with extra2");
> +				SET_FAIL("Directory with extra2");
>  		} else {
>  			if ((table->proc_handler == proc_dostring) ||
>  			    (table->proc_handler == proc_dointvec) ||
> @@ -139,28 +131,42 @@ int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
>  			    (table->proc_handler == proc_doulongvec_minmax) ||
>  			    (table->proc_handler == proc_doulongvec_ms_jiffies_minmax)) {
>  				if (!table->data)
> -					set_fail(&fail, table, "No data");
> +					SET_FAIL("No data");
>  				if (!table->maxlen)
> -					set_fail(&fail, table, "No maxlen");
> +					SET_FAIL("No maxlen");
>  			}
>  #ifdef CONFIG_PROC_SYSCTL
>  			if (table->procname && !table->proc_handler)
> -				set_fail(&fail, table, "No proc_handler");
> -#endif
> -#if 0
> -			if (!table->procname && table->proc_handler)
> -				set_fail(&fail, table, "proc_handler without procname");
> +				SET_FAIL("No proc_handler");
>  #endif
> -			sysctl_check_leaf(namespaces, table, &fail);
> +			parents[depth] = table;
> +			sysctl_check_leaf(namespaces, table, &fail,
> +					  parents, depth);
>  		}
>  		if (table->mode > 0777)
> -			set_fail(&fail, table, "bogus .mode");
> +			SET_FAIL("bogus .mode");
>  		if (fail) {
> -			set_fail(&fail, table, NULL);
> +			SET_FAIL(NULL);
>  			error = -EINVAL;
>  		}
> -		if (table->child)
> -			error |= sysctl_check_table(namespaces, table->child);
> +		if (table->child) {
> +			parents[depth] = table;
> +			error |= __sysctl_check_table(namespaces, table->child,
> +						      parents, depth + 1);
> +		}
>  	}
>  	return error;
>  }
> +
> +
> +int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
> +{
> +	struct ctl_table *parents[CTL_MAXNAME];
> +	/* Keep track of parents as we go down into the tree.
> +	 *
> +	 * parents[i-1] will be the parent for parents[i].
> +	 * The node at depth 'd' will have the parent at parents[d-1].
> +	 * The root node (depth=0) has no parent in this array.
> +	 */
> +	return __sysctl_check_table(namespaces, table, parents, 0);
> +}

^ permalink raw reply

* Re: [PATCH 12/20] net: can: at91_can: world-writable sysfs files
From: David Miller @ 2011-02-04 21:06 UTC (permalink / raw)
  To: kurt.van.dijck-/BeEPy95v10
  Cc: security-DgEjT+Ai2ygdnm+yROfE0A, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	segoon-cxoSlKxDwOJWk0Htik3J/w, wg-5Yr1BZd7O62+XT7JhA+gdA
In-Reply-To: <20110204124233.GB334-MxZ6Iy/zr/UdbCeoMzGj59i2O/JbrIOy@public.gmane.org>

From: Kurt Van Dijck <kurt.van.dijck-/BeEPy95v10@public.gmane.org>
Date: Fri, 4 Feb 2011 13:42:33 +0100

> On Fri, Feb 04, 2011 at 03:23:50PM +0300, Vasiliy Kulikov wrote:
>> Don't allow everybody to write to mb0_id file.
>> 
> very well!
> 
> Acked-by: Kurt Van Dijck <kurt.van.dijck-/BeEPy95v10@public.gmane.org>

Applied.

^ permalink raw reply

* Re: [PATCH 13/20] net: can: janz-ican3: world-writable sysfs termination file
From: David Miller @ 2011-02-04 21:06 UTC (permalink / raw)
  To: segoon-cxoSlKxDwOJWk0Htik3J/w
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA, security-DgEjT+Ai2ygdnm+yROfE0A,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, wg-5Yr1BZd7O62+XT7JhA+gdA
In-Reply-To: <6b49b9521416fbd50214485d3e14e5f254ada4f7.1296818921.git.segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>

From: Vasiliy Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>
Date: Fri,  4 Feb 2011 15:23:53 +0300

> Don't allow everybody to set terminator via sysfs.
> 
> Signed-off-by: Vasiliy Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>

Applied.

^ permalink raw reply

* Re: [PATCH] MAINTAINERS: update email ids of the be2net driver maintainers.
From: David Miller @ 2011-02-04 21:05 UTC (permalink / raw)
  To: ajit.khaparde; +Cc: netdev
In-Reply-To: <d49842f2-5ee8-4e17-b95c-f3d2108399f1@exht1.ad.emulex.com>

From: Ajit Khaparde <ajit.khaparde@emulex.com>
Date: Fri, 4 Feb 2011 09:31:29 -0600

> Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] bridge: Don't put partly initialized fdb into hash
From: David Miller @ 2011-02-04 21:02 UTC (permalink / raw)
  To: xemul; +Cc: shemminger, bridge, netdev
In-Reply-To: <4D4C2210.8010705@parallels.com>

From: Pavel Emelyanov <xemul@parallels.com>
Date: Fri, 04 Feb 2011 18:58:08 +0300

> The fdb_create() puts a new fdb into the hash with only addr set. This is
> not good, since there are callers that search the hash without the lock
> and access all of its other fields.
> 
> Applies to current netdev tree.
> 
> Signed-off-by: Pavel Emelyanov <xemul@openvz.org>

Whoa, good catch.  Applied, thanks!
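
The ordering problem described in the quoted changelog can be sketched in
plain user-space C. This is only an illustration, not the bridge code:
struct entry, entry_publish() and entry_find() are hypothetical names, the
hash is a toy, and the example is single-threaded. A real lockless reader
would additionally need an RCU-style publish primitive or a memory barrier
between the field stores and the store that links the entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HASH_SIZE 256

/* Hypothetical stand-in for a forwarding database entry. */
struct entry {
	struct entry *next;
	unsigned char addr[6];
	int port;
	int is_local;
};

static struct entry *table[HASH_SIZE];

/* Toy hash: last byte of the address, always < HASH_SIZE. */
static unsigned int hash_addr(const unsigned char *addr)
{
	return addr[5];
}

/*
 * Fill in every field first, and only then make the entry reachable.
 * A reader that finds the entry can never see a half-built one
 * (given a proper publish barrier in concurrent code).
 */
static void entry_publish(struct entry *e, const unsigned char *addr,
			  int port, int is_local)
{
	unsigned int h;

	assert(e && addr);
	h = hash_addr(addr);

	memcpy(e->addr, addr, sizeof(e->addr));
	e->port = port;
	e->is_local = is_local;

	/* Linking into the hash chain is the very last step. */
	e->next = table[h];
	table[h] = e;
}

/* Look up an entry by address; returns NULL if it is not present. */
static struct entry *entry_find(const unsigned char *addr)
{
	struct entry *e;

	for (e = table[hash_addr(addr)]; e; e = e->next)
		if (memcmp(e->addr, addr, sizeof(e->addr)) == 0)
			return e;
	return NULL;
}
```

The design point is simply that the store which makes the entry visible
comes after all stores that initialize it, so a lookup never has to cope
with unset fields.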

^ permalink raw reply

* Re: [PATCH #2 0/0] r8169 driver fixes
From: David Miller @ 2011-02-04 20:49 UTC (permalink / raw)
  To: romieu; +Cc: netdev, ivecera, hayeswang
In-Reply-To: <20110204095832.GA9224@electric-eye.fr.zoreil.com>

From: Francois Romieu <romieu@fr.zoreil.com>
Date: Fri, 4 Feb 2011 10:58:32 +0100

> Rebased on top of davem/net-2.6.git.
> 
> The following series includes Ivan's Rx FIFO overflow fix and similar
> changes I did after testing with various 8168 chipsets.
> 
> The series is available as
> git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6.git r8169-davem
> 
> to get the changes below.

Looks a lot better, pulled, thanks a lot Francois!

^ permalink raw reply

* Oops in tcp_output.c, kernel 2.6.38-rc3
From: Chuck Ebbert @ 2011-02-04 20:32 UTC (permalink / raw)
  To: netdev; +Cc: Ilpo Järvinen

Analysis is below. (From https://bugzilla.redhat.com/show_bug.cgi?id=674622)

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
 IP: [<ffffffff81407b16>] tcp_write_xmit+0x694/0x7af
 PGD 0 
 Oops: 0002 [#1] SMP 
 last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
 CPU 0 
 Modules linked in: nls_utf8 hfsplus hfs vfat fat ext2 usb_storage uas cpufreq_ondemand acpi_cpufreq freq_table mperf snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel e1000e btusb i2c_i801 snd_hda_codec serio_raw snd_hwdep atl1e snd_seq snd_seq_device snd_pcm iTCO_wdt iTCO_vendor_support snd_timer asus_atk0110 bluetooth rfkill snd microcode soundcore snd_page_alloc ipv6 firewire_ohci firewire_core crc_itu_t radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
 Pid: 1411, comm: ssh Not tainted 2.6.38-0.rc3.git0.1.fc15.x86_64 #1 P5Q-PRO/P5Q-PRO
 RIP: 0010:[<ffffffff81407b16>]  [<ffffffff81407b16>] tcp_write_xmit+0x694/0x7af
 RSP: 0018:ffff88022373db88  EFLAGS: 00010202
 RAX: ffff88022644aa00 RBX: ffff880224178d00 RCX: 0000000000000001
 RDX: 0000000000000000 RSI: ffff88022644aa00 RDI: ffff88022644aa00
 RBP: ffff88022373dc08 R08: 0000000000000140 R09: ffff880223129000
 R10: 0000000000001c48 R11: 0000000000000005 R12: ffff88022644aa00
 R13: 0000000000000b50 R14: 00000000000005a8 R15: 0000000000000000
 FS:  00007fc81ed797e0(0000) GS:ffff8800cfc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000008 CR3: 0000000223093000 CR4: 00000000000406f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process ssh (pid: 1411, threadinfo ffff88022373c000, task ffff8802237f4560)
 Stack:
  ffff880200003a00 0000000200000c90 0000000000000140 ffff88022644aa00
  0000000000000001 0000000100000140 000000202373dc08 ffffffff813b5197
  ffff880200000000 ffff880224178e08 ffff88022373dbe8 ffff880224178d00
 Call Trace:
  [<ffffffff813b5197>] ? __alloc_skb+0x8d/0x133
  [<ffffffff81407c88>] __tcp_push_pending_frames+0x23/0x51
  [<ffffffff813faa61>] tcp_push+0x8c/0x8e
  [<ffffffff813fcb30>] tcp_sendmsg+0x732/0x826
  [<ffffffff81418969>] inet_sendmsg+0x66/0x6f
  [<ffffffff813af53b>] __sock_sendmsg+0x69/0x76
  [<ffffffff813af601>] sock_aio_write+0xb9/0xc9
  [<ffffffff8112f627>] ? set_fd_set+0x3c/0x46
  [<ffffffff81120c57>] do_sync_write+0xbf/0xff
  [<ffffffff811e7967>] ? security_file_permission+0x2e/0x33
  [<ffffffff81121042>] ? rw_verify_area+0xb0/0xcd
  [<ffffffff811212d4>] vfs_write+0xb3/0xf3
  [<ffffffff811214bc>] sys_write+0x4a/0x6e
  [<ffffffff81009bc2>] system_call_fastpath+0x16/0x1b
 Code: f2 48 89 df 48 89 c6 e8 83 e4 ff ff 48 8b 45 98 48 89 c7 e8 df e5 ff ff 49 8b 14 24 48 8b 45 98 48 89 10 4c 89 60 08 49 89 04 24 <48> 89 42 08 ff 83 18 01 00 00 48 8b 05 59 9d 73 00 8b 4d b4 ba 
 RIP  [<ffffffff81407b16>] tcp_write_xmit+0x694/0x7af
  RSP <ffff88022373db88>
  CR2: 0000000000000008


OOPS is at include/linux/skbuff.h:895:
static inline void __skb_insert(struct sk_buff *newsk,
                                struct sk_buff *prev, struct sk_buff *next,
                                struct sk_buff_head *list)
{
        newsk->next = next;
        newsk->prev = prev;
==>     next->prev  = prev->next = newsk;
        list->qlen++;
}

  next is NULL here


Called from include/linux/skbuff.h:991:
static inline void __skb_queue_after(struct sk_buff_head *list,
                                     struct sk_buff *prev,
                                     struct sk_buff *newsk)
{
        __skb_insert(newsk, prev, prev->next, list);
}

Called from include/net/tcp.h:1294:
static inline void tcp_insert_write_queue_after(struct sk_buff *skb,
                                                struct sk_buff *buff,
                                                struct sock *sk)
{
        __skb_queue_after(&sk->sk_write_queue, skb, buff);
}

Called from net/ipv4/tcp_output.c:tso_fragment:1515:
        /* Link BUFF into the send queue. */
        skb_header_release(buff);
==>     tcp_insert_write_queue_after(skb, buff, sk);


Called from net/ipv4/tcp_output.c:tcp_write_xmit:1784:
                if (skb->len > limit &&
==>                 unlikely(tso_fragment(sk, skb, limit, mss_now, gfp)))
                        break;
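
To make the failing store concrete, here is a minimal user-space sketch of
the same four-pointer insertion. The names (struct buf, struct queue,
queue_after_checked) are hypothetical stand-ins, not the skbuff API. In a
well-formed circular list no next pointer is ever NULL, so the sketch
assumes the crash corresponds to "prev" not being linked into the queue at
that moment; that is an assumption for illustration, not a diagnosis. With
prev as the second pointer-sized field, a store to next->prev with
next == NULL targets address 8 on a 64-bit build, which is consistent with
CR2 in the oops above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: next and prev are the first two fields. */
struct buf {
	struct buf *next;
	struct buf *prev;
};

/* The queue embeds a sentinel node, so the list is circular. */
struct queue {
	struct buf head;
	unsigned int qlen;
};

static void queue_init(struct queue *q)
{
	q->head.next = q->head.prev = &q->head;
	q->qlen = 0;
}

/* The same pointer updates as in the __skb_insert() excerpt above. */
static void insert(struct buf *newb, struct buf *prev, struct buf *next,
		   struct queue *q)
{
	newb->next = next;
	newb->prev = prev;
	next->prev = prev->next = newb;	/* faults if next == NULL */
	q->qlen++;
}

/*
 * Insert newb after prev, refusing to do so when prev is not linked.
 * Returns 0 on success, -1 if prev->next is NULL.
 */
static int queue_after_checked(struct queue *q, struct buf *prev,
			       struct buf *newb)
{
	assert(q && prev && newb);
	if (prev->next == NULL)
		return -1;
	insert(newb, prev, prev->next, q);
	return 0;
}
```

The check in queue_after_checked() only demonstrates where the invariant
is violated; it is not a proposed fix for tso_fragment().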

^ permalink raw reply

* [PATCH 1/6] sysctl: faster reimplementation of sysctl_check_table
From: Lucian Adrian Grijincu @ 2011-02-04 20:31 UTC (permalink / raw)
  To: linux-kernel, netdev, Eric W. Biederman, Eric Dumazet,
	David S. Miller, Oct
  Cc: Lucian Adrian Grijincu
In-Reply-To: <m1oc6rio5u.fsf@fess.ebiederm.org>

Determining the parent of a node at depth d
- previous implementation: O(d)
- current  implementation: O(1)

Printing the path to a node at depth d
- previous implementation: O(d^2)
- current  implementation: O(d)

This comes at a cost: we use an array ('parents') holding as many
pointers as there can be sysctl levels (currently CTL_MAXNAME=10).

The 'parents' array of pointers holds the same values as the
ctl_table->parent field because the function that updates ->parent
(sysctl_set_parent) is called with either NULL (for root nodes) or
with sysctl_set_parent(table, table->child).

Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
---
 kernel/sysctl_check.c |  130 +++++++++++++++++++++++++-----------------------
 1 files changed, 68 insertions(+), 62 deletions(-)

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index 10b90d8..d23b085 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -6,58 +6,34 @@
 #include <net/ip_vs.h>
 
 
-static int sysctl_depth(struct ctl_table *table)
-{
-	struct ctl_table *tmp;
-	int depth;
-
-	depth = 0;
-	for (tmp = table; tmp->parent; tmp = tmp->parent)
-		depth++;
-
-	return depth;
-}
-
-static struct ctl_table *sysctl_parent(struct ctl_table *table, int n)
+static void sysctl_print_path(struct ctl_table *table,
+			      struct ctl_table **parents, int depth)
 {
+	struct ctl_table *p;
 	int i;
-
-	for (i = 0; table && i < n; i++)
-		table = table->parent;
-
-	return table;
-}
-
-
-static void sysctl_print_path(struct ctl_table *table)
-{
-	struct ctl_table *tmp;
-	int depth, i;
-	depth = sysctl_depth(table);
 	if (table->procname) {
-		for (i = depth; i >= 0; i--) {
-			tmp = sysctl_parent(table, i);
-			printk("/%s", tmp->procname?tmp->procname:"");
+		for (i = 0; i < depth; i++) {
+			p = parents[i];
+			printk("/%s", p->procname ? p->procname : "");
 		}
+		printk("/%s", table->procname);
 	}
 	printk(" ");
 }
 
 static struct ctl_table *sysctl_check_lookup(struct nsproxy *namespaces,
-						struct ctl_table *table)
+	     struct ctl_table *table, struct ctl_table **parents, int depth)
 {
 	struct ctl_table_header *head;
 	struct ctl_table *ref, *test;
-	int depth, cur_depth;
-
-	depth = sysctl_depth(table);
+	int cur_depth;
 
 	for (head = __sysctl_head_next(namespaces, NULL); head;
 	     head = __sysctl_head_next(namespaces, head)) {
 		cur_depth = depth;
 		ref = head->ctl_table;
 repeat:
-		test = sysctl_parent(table, cur_depth);
+		test = parents[depth - cur_depth];
 		for (; ref->procname; ref++) {
 			int match = 0;
 			if (cur_depth && !ref->child)
@@ -83,11 +59,12 @@ out:
 	return ref;
 }
 
-static void set_fail(const char **fail, struct ctl_table *table, const char *str)
+static void set_fail(const char **fail, struct ctl_table *table,
+	     const char *str, struct ctl_table **parents, int depth)
 {
 	if (*fail) {
 		printk(KERN_ERR "sysctl table check failed: ");
-		sysctl_print_path(table);
+		sysctl_print_path(table, parents, depth);
 		printk(" %s\n", *fail);
 		dump_stack();
 	}
@@ -95,40 +72,55 @@ static void set_fail(const char **fail, struct ctl_table *table, const char *str
 }
 
 static void sysctl_check_leaf(struct nsproxy *namespaces,
-				struct ctl_table *table, const char **fail)
+			      struct ctl_table *table, const char **fail,
+			      struct ctl_table **parents, int depth)
 {
 	struct ctl_table *ref;
 
-	ref = sysctl_check_lookup(namespaces, table);
-	if (ref && (ref != table))
-		set_fail(fail, table, "Sysctl already exists");
+	ref = sysctl_check_lookup(namespaces, table, parents, depth);
+	if (ref && (ref != table)) {
+		printk(KERN_ALERT "sysctl_check_leaf ref[%s], table[%s]\n", ref->procname, table->procname);
+		set_fail(fail, table, "Sysctl already exists", parents, depth);
+	}
 }
 
-int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
+
+
+#define SET_FAIL(str) set_fail(&fail, table, str, parents, depth)
+
+static int __sysctl_check_table(struct nsproxy *namespaces,
+	struct ctl_table *table, struct ctl_table **parents, int depth)
 {
+	const char *fail = NULL;
 	int error = 0;
+
+	if (depth >= CTL_MAXNAME) {
+		SET_FAIL("Sysctl tree too deep");
+		return -EINVAL;
+	}
+
 	for (; table->procname; table++) {
-		const char *fail = NULL;
+		fail = NULL;
 
 		if (table->parent) {
 			if (table->procname && !table->parent->procname)
-				set_fail(&fail, table, "Parent without procname");
+				SET_FAIL("Parent without procname");
 		}
 		if (!table->procname)
-			set_fail(&fail, table, "No procname");
+			SET_FAIL("No procname");
 		if (table->child) {
 			if (table->data)
-				set_fail(&fail, table, "Directory with data?");
+				SET_FAIL("Directory with data?");
 			if (table->maxlen)
-				set_fail(&fail, table, "Directory with maxlen?");
+				SET_FAIL("Directory with maxlen?");
 			if ((table->mode & (S_IRUGO|S_IXUGO)) != table->mode)
-				set_fail(&fail, table, "Writable sysctl directory");
+				SET_FAIL("Writable sysctl directory");
 			if (table->proc_handler)
-				set_fail(&fail, table, "Directory with proc_handler");
+				SET_FAIL("Directory with proc_handler");
 			if (table->extra1)
-				set_fail(&fail, table, "Directory with extra1");
+				SET_FAIL("Directory with extra1");
 			if (table->extra2)
-				set_fail(&fail, table, "Directory with extra2");
+				SET_FAIL("Directory with extra2");
 		} else {
 			if ((table->proc_handler == proc_dostring) ||
 			    (table->proc_handler == proc_dointvec) ||
@@ -139,28 +131,42 @@ int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
 			    (table->proc_handler == proc_doulongvec_minmax) ||
 			    (table->proc_handler == proc_doulongvec_ms_jiffies_minmax)) {
 				if (!table->data)
-					set_fail(&fail, table, "No data");
+					SET_FAIL("No data");
 				if (!table->maxlen)
-					set_fail(&fail, table, "No maxlen");
+					SET_FAIL("No maxlen");
 			}
 #ifdef CONFIG_PROC_SYSCTL
 			if (table->procname && !table->proc_handler)
-				set_fail(&fail, table, "No proc_handler");
-#endif
-#if 0
-			if (!table->procname && table->proc_handler)
-				set_fail(&fail, table, "proc_handler without procname");
+				SET_FAIL("No proc_handler");
 #endif
-			sysctl_check_leaf(namespaces, table, &fail);
+			parents[depth] = table;
+			sysctl_check_leaf(namespaces, table, &fail,
+					  parents, depth);
 		}
 		if (table->mode > 0777)
-			set_fail(&fail, table, "bogus .mode");
+			SET_FAIL("bogus .mode");
 		if (fail) {
-			set_fail(&fail, table, NULL);
+			SET_FAIL(NULL);
 			error = -EINVAL;
 		}
-		if (table->child)
-			error |= sysctl_check_table(namespaces, table->child);
+		if (table->child) {
+			parents[depth] = table;
+			error |= __sysctl_check_table(namespaces, table->child,
+						      parents, depth + 1);
+		}
 	}
 	return error;
 }
+
+
+int sysctl_check_table(struct nsproxy *namespaces, struct ctl_table *table)
+{
+	struct ctl_table *parents[CTL_MAXNAME];
+	/* Keep track of parents as we go down into the tree.
+	 *
+	 * parents[i-1] will be the parent for parents[i].
+	 * The node at depth 'd' will have the parent at parents[d-1].
+	 * The root node (depth=0) has no parent in this array.
+	 */
+	return __sysctl_check_table(namespaces, table, parents, 0);
+}
-- 
1.7.4.rc1.7.g2cf08.dirty
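
The idea in the changelog above, keeping a stack of ancestors while
descending so that a parent lookup is O(1) and a path print is O(d), can
be sketched outside the kernel as follows. This is a simplified
illustration under stated assumptions, not the patch itself: struct node,
MAX_DEPTH (standing in for CTL_MAXNAME), format_path() and find_path() are
hypothetical names, and a NULL name terminates each sibling array in the
same way a NULL procname does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 10	/* hypothetical stand-in for CTL_MAXNAME */

/* Hypothetical node type: a NULL name terminates a sibling array. */
struct node {
	const char *name;
	struct node *child;
};

/* Write "/a/b/leaf" into buf using the ancestor stack: O(depth). */
static void format_path(char *buf, size_t len, struct node **parents,
			int depth, const struct node *n)
{
	size_t off = 0;
	int i;

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < depth && off < len; i++)
		off += (size_t)snprintf(buf + off, len - off, "/%s",
					parents[i]->name);
	if (off < len)
		snprintf(buf + off, len - off, "/%s", n->name);
}

/*
 * Depth-first search for the first node called 'name'. On success the
 * full path is written to buf and 1 is returned; otherwise 0.
 * parents[0..depth-1] always holds the ancestors of the current level.
 */
static int find_path(struct node *table, const char *name,
		     struct node **parents, int depth,
		     char *buf, size_t len)
{
	if (depth >= MAX_DEPTH)	/* parents[depth] would be out of bounds */
		return 0;

	for (; table->name; table++) {
		if (strcmp(table->name, name) == 0) {
			format_path(buf, len, parents, depth, table);
			return 1;
		}
		if (table->child) {
			parents[depth] = table;	/* O(1) bookkeeping */
			if (find_path(table->child, name, parents,
				      depth + 1, buf, len))
				return 1;
		}
	}
	return 0;
}
```

A caller would declare struct node *parents[MAX_DEPTH] on its own stack and
start the walk at depth 0, mirroring what the sysctl_check_table() wrapper
in the patch does.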

^ permalink raw reply related

* Re: [PATCH] tcp: Increase the initial congestion window to 10.
From: Ilpo Järvinen @ 2011-02-04 19:50 UTC (permalink / raw)
  To: Yuchung Cheng
  Cc: David Miller, Netdev, therbert, H.K. Jerry Chu, Nandita Dukkipati
In-Reply-To: <AANLkTi=Zxp3VGt266MZ+NVmSFQhtnmUxPBFV_t2hcObZ@mail.gmail.com>

On Thu, 3 Feb 2011, Yuchung Cheng wrote:

> On Thu, Feb 3, 2011 at 2:43 PM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
> > It would perhaps be useful to change receiver advertized window to include
> > some extra segs initially. It should be >= IW + peer's dupThresh-1 as
> That's a very good point.
> 
> Maybe IRW should be IW + ssthresh - 1 since Linux also performs
> limited-transmit during fast-recovery as described in RFC 3517. This
> way sender can keep sending new data during the recovery as long as
> cwnd allows.

You're of course right, I forgot the recovery altogether :-).

-- 
 i.

^ permalink raw reply

* Re: [PATCH] tcp: Increase the initial congestion window to 10.
From: Ilpo Järvinen @ 2011-02-04 19:43 UTC (permalink / raw)
  To: H.K. Jerry Chu; +Cc: David Miller, Netdev, therbert, Jerry Chu
In-Reply-To: <AANLkTinkwzts5ysW26fHqX4u89Q=kW2kSOArqL=o6RLM@mail.gmail.com>

On Thu, 3 Feb 2011, H.K. Jerry Chu wrote:

> On Thu, Feb 3, 2011 at 2:43 PM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
> > It would perhaps be useful to change receiver advertized window to include
> > some extra segs initially. It should be >= IW + peer's dupThresh-1 as
> > otherwise limited transmit won't work for the initial window because we
> > won't open more receiver window with dupacks (IIRC, I suppose Jerry might
> > be able to correct me right away if I'm wrong and we open window with
> > dupacks too?).
> 
> Sorry, I don't know how the receive window is updated in Linux,
> autotuning or not. But I just wonder why it would have to do with
> dupacks, i.e., why would it not slide forward as long as the left edge
> of the window slides forward, regardless of OOO pkt arrival?

?? A DupACK by definition does not slide the left edge?!? :-) ...It
certainly makes a difference whether the ACK is cumulative or not.
Anyway, I tcpdumped it now and confirmed that the advertized window is
not advanced if an OOO packet arrives.
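
The bound quoted above (rwnd >= IW + dupThresh - 1) is simple arithmetic
and can be written down as a small helper. The function names below are
hypothetical; the only assumption is the one stated in the thread, namely
that limited transmit sends one new segment for each of the first
dupThresh - 1 duplicate ACKs while the advertized window stays put.

```c
#include <assert.h>

/*
 * Smallest initial receive window, in segments, that still lets the
 * sender use limited transmit on a full initial window: the IW itself
 * plus one new segment per dupACK below the duplicate threshold.
 */
static unsigned int min_init_rwnd_segs(unsigned int iw,
				       unsigned int dupthresh)
{
	assert(dupthresh >= 1);
	return iw + dupthresh - 1;
}

/* The same bound expressed in bytes for a given MSS. */
static unsigned int min_init_rwnd_bytes(unsigned int iw,
					unsigned int dupthresh,
					unsigned int mss)
{
	return min_init_rwnd_segs(iw, dupthresh) * mss;
}
```

With IW = 10 and the usual dupThresh of 3 this gives 12 segments, or
17520 bytes at an MSS of 1460.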

> I am of the opinion that rwnd is for flow control purposes only and thus
> should be fully decoupled from the cwnd of the other (sender) side.
> Therefore initrwnd should normally be based on projected BDP and local
> memory pressure, e.g., 64KB, not bearing any relation to the IW of the
> other side. Only under special circumstances should it be used to
> constrain the sender, e.g., for devices behind slow links with very
> small buffers.

I also think that the advertized window autotuning code is just
unnecessarily restrictive (besides the IW change, Quickstart also
couldn't be used that efficiently because of it).

-- 
 i.

^ permalink raw reply

* Re: [PATCH 2/5] sysctl: remove useless ctl_table->parent field
From: Eric W. Biederman @ 2011-02-04 19:41 UTC (permalink / raw)
  To: Lucian Adrian Grijincu
  Cc: linux-kernel, netdev, Eric Dumazet, David S. Miller,
	Octavian Purdila
In-Reply-To: <9a1977a6526ca9e0b03ba1df767f842aea62b5f4.1296793770.git.lucian.grijincu@gmail.com>

Lucian Adrian Grijincu <lucian.grijincu@gmail.com> writes:

> The 'parent' field was added for selinux in:
>     commit d912b0cc1a617d7c590d57b7ea971d50c7f02503
>     [PATCH] sysctl: add a parent entry to ctl_table and set the parent entry
>
> and then was used for sysctl_check_table.
>
> Both of the users have found other implementations.

This seems reasonable but we need to be careful in how we merge this so
the individual trees are correct.

> CC: Eric W. Biederman <ebiederm@xmission.com>
> Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
> ---
>  include/linux/sysctl.h |    1 -
>  kernel/sysctl.c        |   11 -----------
>  kernel/sysctl_check.c  |    4 ++--
>  3 files changed, 2 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
> index 7bb5cb6..1f1da4b 100644
> --- a/include/linux/sysctl.h
> +++ b/include/linux/sysctl.h
> @@ -1018,7 +1018,6 @@ struct ctl_table
>  	int maxlen;
>  	mode_t mode;
>  	struct ctl_table *child;
> -	struct ctl_table *parent;	/* Automatically set */
>  	proc_handler *proc_handler;	/* Callback for text formatting */
>  	void *extra1;
>  	void *extra2;
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 56f6fc1..42025ec 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -1695,18 +1695,8 @@ int sysctl_perm(struct ctl_table_root *root, struct ctl_table *table, int op)
>  	return test_perm(mode, op);
>  }
>  
> -static void sysctl_set_parent(struct ctl_table *parent, struct ctl_table *table)
> -{
> -	for (; table->procname; table++) {
> -		table->parent = parent;
> -		if (table->child)
> -			sysctl_set_parent(table, table->child);
> -	}
> -}
> -
>  static __init int sysctl_init(void)
>  {
> -	sysctl_set_parent(NULL, root_table);
>  #ifdef CONFIG_SYSCTL_SYSCALL_CHECK
>  	sysctl_check_table(current->nsproxy, root_table);
>  #endif
> @@ -1864,7 +1854,6 @@ struct ctl_table_header *__register_sysctl_paths(
>  	header->used = 0;
>  	header->unregistering = NULL;
>  	header->root = root;
> -	sysctl_set_parent(NULL, header->ctl_table);
>  	header->count = 1;
>  #ifdef CONFIG_SYSCTL_SYSCALL_CHECK
>  	if (sysctl_check_table(namespaces, header->ctl_table)) {
> diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
> index 9b4fecd..b7d9c66 100644
> --- a/kernel/sysctl_check.c
> +++ b/kernel/sysctl_check.c
> @@ -95,8 +95,8 @@ static int __sysctl_check_table(struct nsproxy *namespaces,
>  	for (; table->procname; table++) {
>  		const char *fail = NULL;
>  
> -		if (table->parent) {
> -			if (table->procname && !table->parent->procname)
> +		if (depth != 0) { /* has parent */
> +			if (table->procname && !parents[depth - 1]->procname)
>  				SET_FAIL("Parent without procname");
>  		}
>  		if (!table->procname)

^ permalink raw reply

