Netdev List


* Re: pull request: wireless-next-2.6 2010-07-29
From: David Miller @ 2010-07-31  4:49 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev
In-Reply-To: <20100729191245.GF2424@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Thu, 29 Jul 2010 15:12:46 -0400

> Yet another slew of changes intended for 2.6.36...
> 
> For the first time, this pull request includes a batch of bluetooth
> stuff by way of Marcel.  Some upcoming developments are likely to
> require more extensive integration between 802.11 and Bluetooth bits, so
> Marcel's tree will be feeding wireless-next-2.6 for a while.
> 
> The rest is the usual stuff from the usual suspects -- mostly driver
> updates with the usual strong showings from ath9k and iwlwifi, this time
> joined by libertas in particular.
> 
> This is a "for-davem" branch, so hopefully there will be no pain for you
> to pull this time. :-)

Pulled, thanks a lot John.

^ permalink raw reply

* [PATCH 2/2] nf_nat: don't check if the tuple is unique when there isn't any other choice
From: Changli Gao @ 2010-07-31  2:22 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, netfilter-devel, netdev, Changli Gao

The tuple returned by unique_tuple() doesn't need to be truly unique, so
the uniqueness check isn't necessary when there is no other choice left.
Eliminating the unnecessary nf_nat_used_tuple() call also saves some CPU
cycles.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 net/ipv4/netfilter/nf_nat_proto_common.c |    4 ++--
 net/ipv4/netfilter/nf_nat_proto_gre.c    |    4 ++--
 net/ipv4/netfilter/nf_nat_proto_icmp.c   |    4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/netfilter/nf_nat_proto_common.c b/net/ipv4/netfilter/nf_nat_proto_common.c
index 2844a03..95aa286 100644
--- a/net/ipv4/netfilter/nf_nat_proto_common.c
+++ b/net/ipv4/netfilter/nf_nat_proto_common.c
@@ -81,9 +81,9 @@ void nf_nat_proto_unique_tuple(struct nf_conntrack_tuple *tuple,
 	else
 		off = *rover;
 
-	for (i = 0; i < range_size; i++, off++) {
+	for (i = 0; 1; off++) {
 		*portptr = htons(min + off % range_size);
-		if (nf_nat_used_tuple(tuple, ct))
+		if (++i != range_size && nf_nat_used_tuple(tuple, ct))
 			continue;
 		if (!(range->flags & IP_NAT_RANGE_PROTO_RANDOM))
 			*rover = off;
diff --git a/net/ipv4/netfilter/nf_nat_proto_gre.c b/net/ipv4/netfilter/nf_nat_proto_gre.c
index 89933ab..1669be6 100644
--- a/net/ipv4/netfilter/nf_nat_proto_gre.c
+++ b/net/ipv4/netfilter/nf_nat_proto_gre.c
@@ -68,9 +68,9 @@ gre_unique_tuple(struct nf_conntrack_tuple *tuple,
 
 	pr_debug("min = %u, range_size = %u\n", min, range_size);
 
-	for (i = 0; i < range_size; i++, key++) {
+	for (i = 0; 1; key++) {
 		*keyptr = htons(min + key % range_size);
-		if (!nf_nat_used_tuple(tuple, ct))
+		if (++i == range_size || !nf_nat_used_tuple(tuple, ct))
 			return;
 	}
 
diff --git a/net/ipv4/netfilter/nf_nat_proto_icmp.c b/net/ipv4/netfilter/nf_nat_proto_icmp.c
index 97003fe..f6b07c9 100644
--- a/net/ipv4/netfilter/nf_nat_proto_icmp.c
+++ b/net/ipv4/netfilter/nf_nat_proto_icmp.c
@@ -42,10 +42,10 @@ icmp_unique_tuple(struct nf_conntrack_tuple *tuple,
 	if (!(range->flags & IP_NAT_RANGE_PROTO_SPECIFIED))
 		range_size = 0xFFFF;
 
-	for (i = 0; i < range_size; i++, id++) {
+	for (i = 0; 1; id++) {
 		tuple->src.u.icmp.id = htons(ntohs(range->min.icmp.id) +
 					     (id % range_size));
-		if (!nf_nat_used_tuple(tuple, ct))
+		if (++i == range_size || !nf_nat_used_tuple(tuple, ct))
 			return;
 	}
 	return;
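
The loop change in this patch can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel code: pick_candidate(), is_used_fn, used_checks and the two predicates are hypothetical names introduced only for illustration. It walks the range from a starting offset and skips the uniqueness check for the last remaining candidate, since that candidate is taken whether or not it is already in use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for nf_nat_used_tuple(): reports whether a
 * candidate value is already taken. */
typedef bool (*is_used_fn)(unsigned int candidate);

/* Number of uniqueness checks performed, so the saving is observable. */
static unsigned int used_checks;

/* Walk range_size candidates starting at min + off % range_size.
 * The last candidate is accepted without a check because there is
 * no other choice left, mirroring the "++i == range_size ||" test. */
static unsigned int pick_candidate(unsigned int min, unsigned int range_size,
                                   unsigned int off, is_used_fn is_used)
{
    unsigned int i, candidate;

    assert(range_size > 0);
    for (i = 0; ; off++) {
        candidate = min + off % range_size;
        if (++i == range_size)
            return candidate;       /* last choice: skip the check */
        used_checks++;
        if (!is_used(candidate))
            return candidate;       /* free candidate found */
    }
}

/* Example predicates for the sketch. */
static bool all_used(unsigned int candidate) { (void)candidate; return true; }
static bool none_used(unsigned int candidate) { (void)candidate; return false; }
```

With every candidate in use and a range of four, the sketch performs three checks instead of four and returns the fourth candidate unchecked; with a range of one it performs no check at all.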

^ permalink raw reply related

* [PATCH 1/2] nf_nat: make unique_tuple return void
From: Changli Gao @ 2010-07-31  2:15 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, netfilter-devel, netdev, Changli Gao

The only user of unique_tuple(), get_unique_tuple(), doesn't care about its
return value, so make unique_tuple() return void.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/netfilter/nf_nat_protocol.h   |    8 ++++----
 net/ipv4/netfilter/nf_nat_proto_common.c  |    8 ++++----
 net/ipv4/netfilter/nf_nat_proto_dccp.c    |    6 +++---
 net/ipv4/netfilter/nf_nat_proto_gre.c     |    8 ++++----
 net/ipv4/netfilter/nf_nat_proto_icmp.c    |    6 +++---
 net/ipv4/netfilter/nf_nat_proto_sctp.c    |    6 +++---
 net/ipv4/netfilter/nf_nat_proto_tcp.c     |    5 ++---
 net/ipv4/netfilter/nf_nat_proto_udp.c     |    5 ++---
 net/ipv4/netfilter/nf_nat_proto_udplite.c |    6 +++---
 net/ipv4/netfilter/nf_nat_proto_unknown.c |    4 ++--
 10 files changed, 30 insertions(+), 32 deletions(-)
diff --git a/include/net/netfilter/nf_nat_protocol.h b/include/net/netfilter/nf_nat_protocol.h
index c398017..df17bac 100644
--- a/include/net/netfilter/nf_nat_protocol.h
+++ b/include/net/netfilter/nf_nat_protocol.h
@@ -27,9 +27,9 @@ struct nf_nat_protocol {
 
 	/* Alter the per-proto part of the tuple (depending on
 	   maniptype), to give a unique tuple in the given range if
-	   possible; return false if not.  Per-protocol part of tuple
-	   is initialized to the incoming packet. */
-	bool (*unique_tuple)(struct nf_conntrack_tuple *tuple,
+	   possible.  Per-protocol part of tuple is initialized to the
+	   incoming packet. */
+	void (*unique_tuple)(struct nf_conntrack_tuple *tuple,
 			     const struct nf_nat_range *range,
 			     enum nf_nat_manip_type maniptype,
 			     const struct nf_conn *ct);
@@ -63,7 +63,7 @@ extern bool nf_nat_proto_in_range(const struct nf_conntrack_tuple *tuple,
 				  const union nf_conntrack_man_proto *min,
 				  const union nf_conntrack_man_proto *max);
 
-extern bool nf_nat_proto_unique_tuple(struct nf_conntrack_tuple *tuple,
+extern void nf_nat_proto_unique_tuple(struct nf_conntrack_tuple *tuple,
 				      const struct nf_nat_range *range,
 				      enum nf_nat_manip_type maniptype,
 				      const struct nf_conn *ct,
diff --git a/net/ipv4/netfilter/nf_nat_proto_common.c b/net/ipv4/netfilter/nf_nat_proto_common.c
index 6c4f11f..2844a03 100644
--- a/net/ipv4/netfilter/nf_nat_proto_common.c
+++ b/net/ipv4/netfilter/nf_nat_proto_common.c
@@ -34,7 +34,7 @@ bool nf_nat_proto_in_range(const struct nf_conntrack_tuple *tuple,
 }
 EXPORT_SYMBOL_GPL(nf_nat_proto_in_range);
 
-bool nf_nat_proto_unique_tuple(struct nf_conntrack_tuple *tuple,
+void nf_nat_proto_unique_tuple(struct nf_conntrack_tuple *tuple,
 			       const struct nf_nat_range *range,
 			       enum nf_nat_manip_type maniptype,
 			       const struct nf_conn *ct,
@@ -53,7 +53,7 @@ bool nf_nat_proto_unique_tuple(struct nf_conntrack_tuple *tuple,
 	if (!(range->flags & IP_NAT_RANGE_PROTO_SPECIFIED)) {
 		/* If it's dst rewrite, can't change port */
 		if (maniptype == IP_NAT_MANIP_DST)
-			return false;
+			return;
 
 		if (ntohs(*portptr) < 1024) {
 			/* Loose convention: >> 512 is credential passing */
@@ -87,9 +87,9 @@ bool nf_nat_proto_unique_tuple(struct nf_conntrack_tuple *tuple,
 			continue;
 		if (!(range->flags & IP_NAT_RANGE_PROTO_RANDOM))
 			*rover = off;
-		return true;
+		return;
 	}
-	return false;
+	return;
 }
 EXPORT_SYMBOL_GPL(nf_nat_proto_unique_tuple);
 
diff --git a/net/ipv4/netfilter/nf_nat_proto_dccp.c b/net/ipv4/netfilter/nf_nat_proto_dccp.c
index 22485ce..570faf2 100644
--- a/net/ipv4/netfilter/nf_nat_proto_dccp.c
+++ b/net/ipv4/netfilter/nf_nat_proto_dccp.c
@@ -22,14 +22,14 @@
 
 static u_int16_t dccp_port_rover;
 
-static bool
+static void
 dccp_unique_tuple(struct nf_conntrack_tuple *tuple,
 		  const struct nf_nat_range *range,
 		  enum nf_nat_manip_type maniptype,
 		  const struct nf_conn *ct)
 {
-	return nf_nat_proto_unique_tuple(tuple, range, maniptype, ct,
-					 &dccp_port_rover);
+	nf_nat_proto_unique_tuple(tuple, range, maniptype, ct,
+				  &dccp_port_rover);
 }
 
 static bool
diff --git a/net/ipv4/netfilter/nf_nat_proto_gre.c b/net/ipv4/netfilter/nf_nat_proto_gre.c
index d7e8920..89933ab 100644
--- a/net/ipv4/netfilter/nf_nat_proto_gre.c
+++ b/net/ipv4/netfilter/nf_nat_proto_gre.c
@@ -37,7 +37,7 @@ MODULE_AUTHOR("Harald Welte <laforge@gnumonks.org>");
 MODULE_DESCRIPTION("Netfilter NAT protocol helper module for GRE");
 
 /* generate unique tuple ... */
-static bool
+static void
 gre_unique_tuple(struct nf_conntrack_tuple *tuple,
 		 const struct nf_nat_range *range,
 		 enum nf_nat_manip_type maniptype,
@@ -50,7 +50,7 @@ gre_unique_tuple(struct nf_conntrack_tuple *tuple,
 	/* If there is no master conntrack we are not PPTP,
 	   do not change tuples */
 	if (!ct->master)
-		return false;
+		return;
 
 	if (maniptype == IP_NAT_MANIP_SRC)
 		keyptr = &tuple->src.u.gre.key;
@@ -71,11 +71,11 @@ gre_unique_tuple(struct nf_conntrack_tuple *tuple,
 	for (i = 0; i < range_size; i++, key++) {
 		*keyptr = htons(min + key % range_size);
 		if (!nf_nat_used_tuple(tuple, ct))
-			return true;
+			return;
 	}
 
 	pr_debug("%p: no NAT mapping\n", ct);
-	return false;
+	return;
 }
 
 /* manipulate a GRE packet according to maniptype */
diff --git a/net/ipv4/netfilter/nf_nat_proto_icmp.c b/net/ipv4/netfilter/nf_nat_proto_icmp.c
index 19a8b0b..97003fe 100644
--- a/net/ipv4/netfilter/nf_nat_proto_icmp.c
+++ b/net/ipv4/netfilter/nf_nat_proto_icmp.c
@@ -27,7 +27,7 @@ icmp_in_range(const struct nf_conntrack_tuple *tuple,
 	       ntohs(tuple->src.u.icmp.id) <= ntohs(max->icmp.id);
 }
 
-static bool
+static void
 icmp_unique_tuple(struct nf_conntrack_tuple *tuple,
 		  const struct nf_nat_range *range,
 		  enum nf_nat_manip_type maniptype,
@@ -46,9 +46,9 @@ icmp_unique_tuple(struct nf_conntrack_tuple *tuple,
 		tuple->src.u.icmp.id = htons(ntohs(range->min.icmp.id) +
 					     (id % range_size));
 		if (!nf_nat_used_tuple(tuple, ct))
-			return true;
+			return;
 	}
-	return false;
+	return;
 }
 
 static bool
diff --git a/net/ipv4/netfilter/nf_nat_proto_sctp.c b/net/ipv4/netfilter/nf_nat_proto_sctp.c
index 3fc598e..756331d 100644
--- a/net/ipv4/netfilter/nf_nat_proto_sctp.c
+++ b/net/ipv4/netfilter/nf_nat_proto_sctp.c
@@ -16,14 +16,14 @@
 
 static u_int16_t nf_sctp_port_rover;
 
-static bool
+static void
 sctp_unique_tuple(struct nf_conntrack_tuple *tuple,
 		  const struct nf_nat_range *range,
 		  enum nf_nat_manip_type maniptype,
 		  const struct nf_conn *ct)
 {
-	return nf_nat_proto_unique_tuple(tuple, range, maniptype, ct,
-					 &nf_sctp_port_rover);
+	nf_nat_proto_unique_tuple(tuple, range, maniptype, ct,
+				  &nf_sctp_port_rover);
 }
 
 static bool
diff --git a/net/ipv4/netfilter/nf_nat_proto_tcp.c b/net/ipv4/netfilter/nf_nat_proto_tcp.c
index 399e2cf..aa460a5 100644
--- a/net/ipv4/netfilter/nf_nat_proto_tcp.c
+++ b/net/ipv4/netfilter/nf_nat_proto_tcp.c
@@ -20,14 +20,13 @@
 
 static u_int16_t tcp_port_rover;
 
-static bool
+static void
 tcp_unique_tuple(struct nf_conntrack_tuple *tuple,
 		 const struct nf_nat_range *range,
 		 enum nf_nat_manip_type maniptype,
 		 const struct nf_conn *ct)
 {
-	return nf_nat_proto_unique_tuple(tuple, range, maniptype, ct,
-					 &tcp_port_rover);
+	nf_nat_proto_unique_tuple(tuple, range, maniptype, ct, &tcp_port_rover);
 }
 
 static bool
diff --git a/net/ipv4/netfilter/nf_nat_proto_udp.c b/net/ipv4/netfilter/nf_nat_proto_udp.c
index 9e61c79..dfe65c7 100644
--- a/net/ipv4/netfilter/nf_nat_proto_udp.c
+++ b/net/ipv4/netfilter/nf_nat_proto_udp.c
@@ -19,14 +19,13 @@
 
 static u_int16_t udp_port_rover;
 
-static bool
+static void
 udp_unique_tuple(struct nf_conntrack_tuple *tuple,
 		 const struct nf_nat_range *range,
 		 enum nf_nat_manip_type maniptype,
 		 const struct nf_conn *ct)
 {
-	return nf_nat_proto_unique_tuple(tuple, range, maniptype, ct,
-					 &udp_port_rover);
+	nf_nat_proto_unique_tuple(tuple, range, maniptype, ct, &udp_port_rover);
 }
 
 static bool
diff --git a/net/ipv4/netfilter/nf_nat_proto_udplite.c b/net/ipv4/netfilter/nf_nat_proto_udplite.c
index 440a229..3cc8c8a 100644
--- a/net/ipv4/netfilter/nf_nat_proto_udplite.c
+++ b/net/ipv4/netfilter/nf_nat_proto_udplite.c
@@ -18,14 +18,14 @@
 
 static u_int16_t udplite_port_rover;
 
-static bool
+static void
 udplite_unique_tuple(struct nf_conntrack_tuple *tuple,
 		     const struct nf_nat_range *range,
 		     enum nf_nat_manip_type maniptype,
 		     const struct nf_conn *ct)
 {
-	return nf_nat_proto_unique_tuple(tuple, range, maniptype, ct,
-					 &udplite_port_rover);
+	nf_nat_proto_unique_tuple(tuple, range, maniptype, ct,
+				  &udplite_port_rover);
 }
 
 static bool
diff --git a/net/ipv4/netfilter/nf_nat_proto_unknown.c b/net/ipv4/netfilter/nf_nat_proto_unknown.c
index 14381c6..a50f2bc 100644
--- a/net/ipv4/netfilter/nf_nat_proto_unknown.c
+++ b/net/ipv4/netfilter/nf_nat_proto_unknown.c
@@ -26,14 +26,14 @@ static bool unknown_in_range(const struct nf_conntrack_tuple *tuple,
 	return true;
 }
 
-static bool unknown_unique_tuple(struct nf_conntrack_tuple *tuple,
+static void unknown_unique_tuple(struct nf_conntrack_tuple *tuple,
 				 const struct nf_nat_range *range,
 				 enum nf_nat_manip_type maniptype,
 				 const struct nf_conn *ct)
 {
 	/* Sorry: we can't help you; if it's not unique, we can't frob
 	   anything. */
-	return false;
+	return;
 }
 
 static bool

^ permalink raw reply related

* [RFC PATCH 2/2] igb/ixgbe: add code to trigger function reset if reset_devices is set
From: Jeff Kirsher @ 2010-07-31  0:59 UTC (permalink / raw)
  To: davem, jbarnes; +Cc: netdev, linux-pci, Alexander Duyck, Jeff Kirsher
In-Reply-To: <20100731005803.32625.6891.stgit@localhost.localdomain>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change lets both igb and ixgbe trigger a full PCIe function reset if
the reset_devices kernel parameter is set.  The main reason for adding this
is that kdump can cause serious issues when the kdump kernel resets the
IOMMU while DMA transactions are still occurring.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igb/igb_main.c     |    3 +++
 drivers/net/ixgbe/ixgbe_main.c |    3 +++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 667b527..b924443 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -1731,6 +1731,9 @@ static int __devinit igb_probe(struct pci_dev *pdev,
 		return -EINVAL;
 	}
 
+	if (reset_devices && pci_reset_device_function(pdev))
+		return -ENODEV;
+
 	err = pci_enable_device_mem(pdev);
 	if (err)
 		return err;
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 7d6a415..f459f24 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -6548,6 +6548,9 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 		return -EINVAL;
 	}
 
+	if (reset_devices && pci_reset_device_function(pdev))
+		return -ENODEV;
+
 	err = pci_enable_device_mem(pdev);
 	if (err)
 		return err;


^ permalink raw reply related

* [RFC PATCH 1/2] pci: add function reset call that can be used inside of probe
From: Jeff Kirsher @ 2010-07-31  0:58 UTC (permalink / raw)
  To: davem, jbarnes; +Cc: netdev, linux-pci, Alexander Duyck, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change adds several new calls.

The first is __pci_dev_reset, which works like pci_dev_reset but does not
take the device lock.  This is important because I found several cases,
such as __pci_reset_function, in which the call took the device lock even
though the lock had not yet been initialized.  In addition, a reset
attempted during probe will hang, since the device lock is already held.

The second is pci_reset_device_function.  It is similar to
pci_reset_function but does not take the device lock, so it too can be
called from the driver probe routine.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/pci/pci.c   |   74 +++++++++++++++++++++++++++++++++++++++++++--------
 include/linux/pci.h |    1 +
 2 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 60f30e7..1421bc7 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2445,17 +2445,14 @@ static int pci_parent_bus_reset(struct pci_dev *dev, int probe)
 	return 0;
 }
 
-static int pci_dev_reset(struct pci_dev *dev, int probe)
+static int __pci_dev_reset(struct pci_dev *dev, int probe)
 {
 	int rc;
 
 	might_sleep();
 
-	if (!probe) {
+	if (!probe)
 		pci_block_user_cfg_access(dev);
-		/* block PM suspend, driver probe, etc. */
-		device_lock(&dev->dev);
-	}
 
 	rc = pci_dev_specific_reset(dev, probe);
 	if (rc != -ENOTTY)
@@ -2474,11 +2471,26 @@ static int pci_dev_reset(struct pci_dev *dev, int probe)
 		goto done;
 
 	rc = pci_parent_bus_reset(dev, probe);
+
 done:
-	if (!probe) {
-		device_unlock(&dev->dev);
+	if (!probe)
 		pci_unblock_user_cfg_access(dev);
-	}
+
+	return rc;
+}
+
+static int pci_dev_reset(struct pci_dev *dev, int probe)
+{
+	int rc;
+
+	/* block PM suspend, driver probe, etc. */
+	if (!probe)
+		device_lock(&dev->dev);
+
+	rc = __pci_dev_reset(dev, probe);
+
+	if (!probe)
+		device_unlock(&dev->dev);
 
 	return rc;
 }
@@ -2502,7 +2514,7 @@ done:
  */
 int __pci_reset_function(struct pci_dev *dev)
 {
-	return pci_dev_reset(dev, 0);
+	return __pci_dev_reset(dev, 0);
 }
 EXPORT_SYMBOL_GPL(__pci_reset_function);
 
@@ -2519,7 +2531,7 @@ EXPORT_SYMBOL_GPL(__pci_reset_function);
  */
 int pci_probe_reset_function(struct pci_dev *dev)
 {
-	return pci_dev_reset(dev, 1);
+	return __pci_dev_reset(dev, 1);
 }
 
 /**
@@ -2542,7 +2554,7 @@ int pci_reset_function(struct pci_dev *dev)
 {
 	int rc;
 
-	rc = pci_dev_reset(dev, 1);
+	rc = __pci_dev_reset(dev, 1);
 	if (rc)
 		return rc;
 
@@ -2563,6 +2575,46 @@ int pci_reset_function(struct pci_dev *dev)
 EXPORT_SYMBOL_GPL(pci_reset_function);
 
 /**
+ * pci_reset_device_function - quiesce and reinitialize a PCI device function
+ * @dev: PCI device to reset
+ *
+ * Some devices allow an individual function to be reset without affecting
+ * other functions in the same device.  The PCI device must be responsive
+ * to PCI config space in order to use this function.
+ *
+ * This function is very similar to pci_reset_function, however this function
+ * does not obtain the device lock during the reset.  This is due to the fact
+ * that the call is meant to be used during probe if the reset_devices
+ * kernel parameter is set.
+ *
+ * Returns 0 if the device function was successfully reset or negative if the
+ * device doesn't support resetting a single function.
+ */
+int pci_reset_device_function(struct pci_dev *dev)
+{
+	int rc;
+
+	rc = __pci_dev_reset(dev, 1);
+	if (rc)
+		return rc;
+
+	pci_save_state(dev);
+
+	/*
+	 * both INTx and MSI are disabled after the Interrupt Disable bit
+	 * is set and the Bus Master bit is cleared.
+	 */
+	pci_write_config_word(dev, PCI_COMMAND, PCI_COMMAND_INTX_DISABLE);
+
+	rc = __pci_dev_reset(dev, 0);
+
+	pci_restore_state(dev);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(pci_reset_device_function);
+
+/**
  * pcix_get_max_mmrbc - get PCI-X maximum designed memory read byte count
  * @dev: PCI device to query
  *
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 6a471ab..c7708d3 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -789,6 +789,7 @@ int pcie_get_readrq(struct pci_dev *dev);
 int pcie_set_readrq(struct pci_dev *dev, int rq);
 int __pci_reset_function(struct pci_dev *dev);
 int pci_reset_function(struct pci_dev *dev);
+int pci_reset_device_function(struct pci_dev *dev);
 void pci_update_resource(struct pci_dev *dev, int resno);
 int __must_check pci_assign_resource(struct pci_dev *dev, int i);
 int pci_select_bars(struct pci_dev *dev, unsigned long flags);
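
The locked/unlocked split described in this patch can be sketched in userspace. This is only an illustration under stated assumptions: struct toy_dev, toy_dev_reset_nolock(), toy_dev_reset() and toy_probe() are hypothetical names, the "lock" is a plain flag rather than a real mutex, and a would-be hang is reported as -EDEADLK so that it is observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy device with a non-recursive lock, modelled as a flag. */
struct toy_dev {
    bool locked;      /* true while the "device lock" is held */
    int resets;       /* number of resets performed */
};

/* Unlocked worker (the kernel convention would be a "__" prefix):
 * the caller decides whether a lock is needed. */
static int toy_dev_reset_nolock(struct toy_dev *dev)
{
    dev->resets++;
    return 0;
}

/* Locked wrapper: takes the lock around the worker.  A real
 * non-recursive lock would hang here if already held; the sketch
 * reports -EDEADLK instead so the problem is visible. */
static int toy_dev_reset(struct toy_dev *dev)
{
    int rc;

    if (dev->locked)
        return -EDEADLK;
    dev->locked = true;
    rc = toy_dev_reset_nolock(dev);
    dev->locked = false;
    return rc;
}

/* Probe-like path: runs with the lock already held, so it must use
 * the unlocked worker. */
static int toy_probe(struct toy_dev *dev, bool use_locked_variant)
{
    int rc;

    assert(!dev->locked);
    dev->locked = true;             /* caller of probe holds the lock */
    rc = use_locked_variant ? toy_dev_reset(dev)
                            : toy_dev_reset_nolock(dev);
    dev->locked = false;
    return rc;
}
```

Calling toy_probe() with the locked variant fails, while the unlocked worker succeeds, which is the reason the patch gives for adding a reset entry point that does not take the device lock.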


^ permalink raw reply related

* [PATCH] net: Add getsockopt support for TCP thin-streams
From: Josh Hunt @ 2010-07-30 23:49 UTC (permalink / raw)
  To: davem, kuznet, jmorris, kaber; +Cc: netdev, linux-kernel, juhlenko, apetlund

The initial TCP thin-stream commit did not add getsockopt support for the
new socket options TCP_THIN_LINEAR_TIMEOUTS and TCP_THIN_DUPACK. This patch
adds it.

Signed-off-by: Josh Hunt <johunt@akamai.com>
---
 net/ipv4/tcp.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 65afeae..3ed3525 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2591,6 +2591,12 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
 			return -EFAULT;
 		return 0;
 	}
+	case TCP_THIN_LINEAR_TIMEOUTS:
+		val = tp->thin_lto;
+		break;
+	case TCP_THIN_DUPACK:
+		val = tp->thin_dupack;
+		break;
 	default:
 		return -ENOPROTOOPT;
 	}
-- 
1.7.0.4
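
From userspace, the effect of this patch would be that the two options can be read back with getsockopt(). A minimal sketch, assuming a Linux system: tcp_get_int_opt() is a hypothetical helper, and the fallback #define values are taken from the Linux UAPI header in case the libc headers in use do not provide the option names.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallback values from the Linux UAPI header, in case the libc
 * headers in use predate these options. */
#ifndef TCP_THIN_LINEAR_TIMEOUTS
#define TCP_THIN_LINEAR_TIMEOUTS 16
#endif
#ifndef TCP_THIN_DUPACK
#define TCP_THIN_DUPACK 17
#endif

/* Read one integer TCP option.  Returns 0 and stores the value in
 * *val on success, or -1 with errno set (for example ENOPROTOOPT on
 * a kernel without getsockopt support for the option). */
static int tcp_get_int_opt(int fd, int optname, int *val)
{
    socklen_t len = sizeof(*val);

    assert(val != 0);
    return getsockopt(fd, IPPROTO_TCP, optname, val, &len);
}
```

On a kernel without this patch, a call such as tcp_get_int_opt(fd, TCP_THIN_LINEAR_TIMEOUTS, &val) would be expected to fail with ENOPROTOOPT, since the option falls through to the default case shown in the diff.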

^ permalink raw reply related

* where are the memory barriers in net driver rx DMA operations?
From: Chris Friesen @ 2010-07-30 15:19 UTC (permalink / raw)
  To: netdev, Linux Kernel Mailing List, davem

Documentation/DMA-API-HOWTO.txt says that memory barriers are still
required when accessing consistent mappings.  The example they give is
for reordering stores to consistent memory but I assume this also
applies to reordering loads.

However, I see many net drivers accessing the descriptor ring (in
consistent memory), checking the status bit for the buffer, then calling
dma_unmap_single() and accessing the data without any explicit memory
barrier.  Does the unmapping call act as a barrier in this case?
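
The ordering concern can be sketched in userspace with C11 atomics. This is an analogy only, not driver code: struct toy_desc, toy_device_write() and toy_driver_poll() are hypothetical names, and the acquire load stands in for a read barrier placed between the status check and the data access. It does not settle whether dma_unmap_single() implies such a barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define DESC_DONE 1u

/* Toy rx descriptor: a status word plus the buffer it describes. */
struct toy_desc {
    atomic_uint status;
    size_t len;
    char buf[64];
};

/* "Device" side: fill the buffer, then publish the status with
 * release ordering so the data is visible before the DONE bit. */
static void toy_device_write(struct toy_desc *d, const char *data, size_t len)
{
    assert(len <= sizeof(d->buf));
    memcpy(d->buf, data, len);
    d->len = len;
    atomic_store_explicit(&d->status, DESC_DONE, memory_order_release);
}

/* "Driver" side: check the status with acquire ordering before
 * touching the data, so the data loads cannot be reordered ahead
 * of the status load.
 * Returns the number of bytes copied, or 0 if nothing is ready. */
static size_t toy_driver_poll(struct toy_desc *d, char *out, size_t out_len)
{
    size_t len;

    if (!(atomic_load_explicit(&d->status, memory_order_acquire) & DESC_DONE))
        return 0;
    len = d->len < out_len ? d->len : out_len;
    memcpy(out, d->buf, len);
    atomic_store_explicit(&d->status, 0, memory_order_relaxed);
    return len;
}
```

If the status load were relaxed instead of acquire, the compiler or CPU would be free to read the buffer before the status word, which is the load reordering the question is about.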

Thanks,

Chris

-- 
Chris Friesen
Software Developer
GENBAND
chris.friesen@genband.com
www.genband.com

^ permalink raw reply

* [072/205] IPv6: only notify protocols if address is completely gone
From: Greg KH @ 2010-07-30 17:51 UTC (permalink / raw)
  To: linux-kernel, stable, Emil S Tantilov, David S. Miller, Greg KH
  Cc: stable-review, torvalds, akpm, alan, NetDev, emil.s.tantilov,
	Stephen Hemminger
In-Reply-To: <20100730175238.GA3924@kroah.com>

2.6.34-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Stephen Hemminger <shemminger@vyatta.com>

(cherry picked from commit 8595805aafc8b077e01804c9a3668e9aa3510e89)

The notifier for address down should only be called if the address is
completely gone, not just marked as tentative on link transition. The code
in net-next would cause bonding/sctp/s390 to see the address disappear on
link down, but they would never see it reappear on link up.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 net/ipv6/addrconf.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2729,7 +2729,9 @@ static int addrconf_ifdown(struct net_de
 		write_unlock_bh(&idev->lock);

 		__ipv6_ifa_notify(RTM_DELADDR, ifa);
-		atomic_notifier_call_chain(&inet6addr_chain, NETDEV_DOWN, ifa);
+		if (ifa->dead)
+			atomic_notifier_call_chain(&inet6addr_chain,
+						   NETDEV_DOWN, ifa);
 		in6_ifa_put(ifa);

 		write_lock_bh(&idev->lock);

^ permalink raw reply

* [071/205] IPv6: keep route for tentative address
From: Greg KH @ 2010-07-30 17:51 UTC (permalink / raw)
  To: linux-kernel, stable, Emil S Tantilov
  Cc: stable-review, torvalds, akpm, alan, NetDev, Greg KH,
	David S. Miller, emil.s.tantilov, Stephen Hemminger
In-Reply-To: <20100730175238.GA3924@kroah.com>

2.6.34-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Stephen Hemminger <shemminger@vyatta.com>

(cherry picked from commit 93fa159abe50d3c55c7f83622d3f5c09b6e06f4b)

Recent changes preserve the IPv6 address when the link goes down (good),
but they cause the address to point to a dead dst entry (bad).
The simplest fix is to not delete the route if the address is
being held for later use.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 net/ipv6/addrconf.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4047,7 +4047,8 @@ static void __ipv6_ifa_notify(int event,
 			addrconf_leave_anycast(ifp);
 		addrconf_leave_solict(ifp->idev, &ifp->addr);
 		dst_hold(&ifp->rt->u.dst);
-		if (ip6_del_rt(ifp->rt))
+
+		if (ifp->dead && ip6_del_rt(ifp->rt))
 			dst_free(&ifp->rt->u.dst);
 		break;
 	}

^ permalink raw reply

* Re: [PATCH 9/9] sunrpc: auth_gss: misused copy_to_user() return value
From: Trond Myklebust @ 2010-07-30 17:48 UTC (permalink / raw)
  To: Kulikov Vasiliy
  Cc: kernel-janitors, J. Bruce Fields, Neil Brown, David S. Miller,
	Jeff Layton, Steve Dickson, Suresh Jayaraman, Kevin Coffman,
	linux-nfs, netdev
In-Reply-To: <1280488234-21223-1-git-send-email-segooon@gmail.com>

On Fri, 2010-07-30 at 15:10 +0400, Kulikov Vasiliy wrote:
> copy_to_user() returns nonzero value on error, this value may be any
> value between 0 and requested count, not only requested count.
> 
> Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
> ---
>  net/sunrpc/auth_gss/auth_gss.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> index 8da2a0e..232d7dc 100644
> --- a/net/sunrpc/auth_gss/auth_gss.c
> +++ b/net/sunrpc/auth_gss/auth_gss.c
> @@ -610,7 +610,7 @@ gss_pipe_upcall(struct file *filp, struct rpc_pipe_msg *msg,
>  	unsigned long left;
>  
>  	left = copy_to_user(dst, data, mlen);
> -	if (left == mlen) {
> +	if (left)
>  		msg->errno = -EFAULT;
>  		return -EFAULT;
>  	}

Ditto unnecessary...

Trond
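
For context on the quoted claim: copy_to_user() returns the number of bytes that could not be copied, with 0 meaning complete success. The difference between the two checks in the quoted diff can be sketched with a userspace mock. Everything here is hypothetical and for illustration only: mock_copy_to_user() and its writable parameter simulate a fault part-way through the destination, and the two predicates mirror the old and new conditions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace mock of the copy_to_user() return convention: copies at
 * most `writable` bytes and returns the number of bytes NOT copied
 * (0 means complete success).  `writable` is a hypothetical knob
 * that simulates a fault part-way through the destination. */
static unsigned long mock_copy_to_user(void *dst, const void *src,
                                       unsigned long n, unsigned long writable)
{
    unsigned long done = n < writable ? n : writable;

    assert(dst && src);
    memcpy(dst, src, done);
    return n - done;
}

/* Old condition: only a total failure counts as an error. */
static int failed_strict_eq(unsigned long left, unsigned long n)
{
    return left == n;
}

/* Proposed condition: any uncopied byte counts as an error. */
static int failed_nonzero(unsigned long left)
{
    return left != 0;
}
```

A partial copy (say 5 of 8 bytes) is reported as a failure only by the second predicate. Whether the stricter check is wanted at this particular call site is what the reply disputes; the sketch only illustrates the return convention.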

^ permalink raw reply

* Re: e1000e crashes with 2.6.34.x and ThinkPad T60
From: Allan, Bruce W @ 2010-07-30 17:42 UTC (permalink / raw)
  To: Marc Haber
  Cc: e1000-devel@lists.sourceforge.net, Network Developers,
	Linux Kernel Developers, Linux
In-Reply-To: <20100730125614.GB21444@torres.zugschlus.de>

On Friday, July 30, 2010 5:56 AM, Marc Haber wrote:
> On Mon, Jul 26, 2010 at 09:13:45AM -0700, Allan, Bruce W wrote:
>> Adding e1000-devel (the Intel LAN developers list).
>> 
>> Please supply the full dmesg you meant to attach with the original
>> report, as well as the output of lspci -vvv.
> 
> Stupid me.
> 
> Greetings
> Marc

Please also provide an eeprom dump from the wired LOM via 'ethtool -e ethX'.

Thanks,
Bruce.
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply

* Re: [PATCH 09/11] Removing dead ETRAX_NETWORK_RED_ON_NO_CONNECTION
From: Jesper Nilsson @ 2010-07-30 17:28 UTC (permalink / raw)
  To: Christoph Egger
  Cc: David S. Miller, Jiri Pirko, Eric Dumazet, Tejun Heo,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	vamos-dev@i4.informatik.uni-erlangen.de
In-Reply-To: <d8e76e2563bb1ced76b1629859a7dc02e8f2523c.1279110895.git.siccegge@cs.fau.de>

On Wed, Jul 14, 2010 at 02:41:20PM +0200, Christoph Egger wrote:
> ETRAX_NETWORK_RED_ON_NO_CONNECTION doesn't exist in Kconfig, therefore
> removing all references for it from the source code.

> Signed-off-by: Christoph Egger <siccegge@cs.fau.de>

Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>

/^JN - Jesper Nilsson
-- 
               Jesper Nilsson -- jesper.nilsson@axis.com

^ permalink raw reply

* Re: [PATCH V4] Export SMBIOS provided firmware instance and label to sysfs
From: Jesse Barnes @ 2010-07-30 16:36 UTC (permalink / raw)
  To: Narendra K
  Cc: netdev, linux-hotplug, linux-pci, matt_domsch, charles_rose,
	jordan_hargrave, vijay_nijhawan
In-Reply-To: <20100726105650.GA19738@auslistsprd01.us.dell.com>

On Mon, 26 Jul 2010 05:56:50 -0500
Narendra K <Narendra_K@dell.com> wrote:

> Hello,
> 
> V3 -> V4:
> 
> Updated the contact field in Documentation/ABI directory.
> 
> Please consider for inclusion -
> 
> From: Narendra K <narendra_k@dell.com>
> Subject: [PATCH] Export SMBIOS provided firmware instance and label to sysfs
> 
> This patch exports SMBIOS provided firmware instance and label
> of onboard pci devices to sysfs
> 

Applied to linux-next, thanks.

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply

* RE: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
From: Shirley Ma @ 2010-07-30 15:51 UTC (permalink / raw)
  To: Xin, Xiaohui
  Cc: netdev@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, mst@redhat.com, mingo@elte.hu,
	davem@davemloft.net, herbert@gondor.apana.org.au,
	jdike@linux.intel.com
In-Reply-To: <F2E9EB7348B8264F86B6AB8151CE2D791BAB307158@shsmsx502.ccr.corp.intel.com>

Hello Xiaohui,

On Fri, 2010-07-30 at 16:53 +0800, Xin, Xiaohui wrote:
> >Since vhost-net already supports macvtap/tun backends, do you think
> >whether it's better to implement zero copy in macvtap/tun than inducing
> >a new media passthrough device here?
> >
> 
> I'm not sure if there will be more duplicated code in the kernel. 

I think there would be less duplicated code in the kernel if we used
macvtap to support what the media passthrough driver does here. Since
macvtap already supports the virtio_net header and offloading, the only
missing function is zero copy. QEMU also supports macvtap, so we would
just need to add a zero-copy flag as an option.

Thanks
Shirley

^ permalink raw reply

* Re: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
From: Shirley Ma @ 2010-07-30 15:46 UTC (permalink / raw)
  To: Avi Kivity
  Cc: xiaohui.xin, netdev, kvm, linux-kernel, mst, mingo, davem,
	herbert, jdike
In-Reply-To: <4C525CD2.5080604@redhat.com>

Hello Avi,

On Fri, 2010-07-30 at 08:02 +0300, Avi Kivity wrote:
> get_user_pages() is indeed slow.  But what about
> get_user_pages_fast()?
> 
> Note that when the page is first touched, get_user_pages_fast() falls 
> back to get_user_pages(), so the latency needs to be measured after 
> quite a bit of warm-up. 

Yes, I used get_user_pages_fast(); however, it fell back to
get_user_pages() when the app doesn't allocate its buffers on the same
page. If I run a single ping, the RTT is extremely high. When running
multiple pings, the RTT drops significantly, but it is still not as fast
as copying in my initial test. I am thinking that we might need to
pre-pin a memory pool.

Shirley

^ permalink raw reply

* [PATCH] nf_nat: use local variable hdrlen
From: Changli Gao @ 2010-07-30  7:58 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, netfilter-devel, netdev, Changli Gao

Use the local variable hdrlen instead of calling ip_hdrlen(skb) repeatedly.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 net/ipv4/netfilter/nf_nat_core.c |   18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/netfilter/nf_nat_core.c b/net/ipv4/netfilter/nf_nat_core.c
index c7719b2..ca2e7e7 100644
--- a/net/ipv4/netfilter/nf_nat_core.c
+++ b/net/ipv4/netfilter/nf_nat_core.c
@@ -440,7 +440,7 @@ int nf_nat_icmp_reply_translation(struct nf_conn *ct,
 	if (!skb_make_writable(skb, hdrlen + sizeof(*inside)))
 		return 0;
 
-	inside = (void *)skb->data + ip_hdrlen(skb);
+	inside = (void *)skb->data + hdrlen;
 
 	/* We're actually going to mangle it beyond trivial checksum
 	   adjustment, so make sure the current checksum is correct. */
@@ -470,12 +470,10 @@ int nf_nat_icmp_reply_translation(struct nf_conn *ct,
 	/* rcu_read_lock()ed by nf_hook_slow */
 	l4proto = __nf_ct_l4proto_find(PF_INET, inside->ip.protocol);
 
-	if (!nf_ct_get_tuple(skb,
-			     ip_hdrlen(skb) + sizeof(struct icmphdr),
-			     (ip_hdrlen(skb) +
+	if (!nf_ct_get_tuple(skb, hdrlen + sizeof(struct icmphdr),
+			     (hdrlen +
 			      sizeof(struct icmphdr) + inside->ip.ihl * 4),
-			     (u_int16_t)AF_INET,
-			     inside->ip.protocol,
+			     (u_int16_t)AF_INET, inside->ip.protocol,
 			     &inner, l3proto, l4proto))
 		return 0;
 
@@ -484,15 +482,13 @@ int nf_nat_icmp_reply_translation(struct nf_conn *ct,
 	   pass all hooks (locally-generated ICMP).  Consider incoming
 	   packet: PREROUTING (DST manip), routing produces ICMP, goes
 	   through POSTROUTING (which must correct the DST manip). */
-	if (!manip_pkt(inside->ip.protocol, skb,
-		       ip_hdrlen(skb) + sizeof(inside->icmp),
-		       &ct->tuplehash[!dir].tuple,
-		       !manip))
+	if (!manip_pkt(inside->ip.protocol, skb, hdrlen + sizeof(inside->icmp),
+		       &ct->tuplehash[!dir].tuple, !manip))
 		return 0;
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL) {
 		/* Reloading "inside" here since manip_pkt inner. */
-		inside = (void *)skb->data + ip_hdrlen(skb);
+		inside = (void *)skb->data + hdrlen;
 		inside->icmp.checksum = 0;
 		inside->icmp.checksum =
 			csum_fold(skb_checksum(skb, hdrlen,

^ permalink raw reply related

* Re: [PATCH] vhost: locking/rcu cleanup
From: Tejun Heo @ 2010-07-30 14:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: David S. Miller, Sridhar Samudrala, Jeff Dike, Juan Quintela,
	Rusty Russell, Takuya Yoshikawa, David Stevens, Paul E. McKenney,
	kvm, virtualization, netdev, linux-kernel
In-Reply-To: <20100729122325.GA24337@redhat.com>

Hello,

On 07/29/2010 02:23 PM, Michael S. Tsirkin wrote:
> I saw WARN_ON(!list_empty(&dev->work_list)) trigger
> so our custom flush is not as airtight as need be.

Could be, but isn't it also possible that something has queued work
after the last flush?  Is the problem reproducible?

> This patch switches to a simple atomic counter + srcu instead of
> the custom locked queue + flush implementation.
> 
> This will slow down the setup ioctls, which should not matter -
> it's slow path anyway. We use the expedited flush to at least
> make sure it has a sane time bound.
> 
> Works fine for me. I got reports that with many guests,
> work lock is highly contended, and this patch should in theory
> fix this as well - but I haven't tested this yet.

Hmmm... vhost_poll_flush() becomes synchronize_srcu_expedited().  Can
you please explain how it works?  synchronize_srcu_expedited() is an
extremely heavy operation involving scheduling the cpu_stop task on
all cpus.  I'm not quite sure whether doing it from every flush is a
good idea.  Is flush supposed to be a very rare operation?

Having a custom implementation is fine too but let's try to implement
something generic if at all possible.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCH repost] sched: export sched_set/getaffinity to modules
From: Tejun Heo @ 2010-07-30 14:31 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Michael S. Tsirkin, Sridhar Samudrala, Peter Zijlstra,
	Ingo Molnar, netdev, lkml, kvm@vger.kernel.org, Andrew Morton,
	Dmitri Vorobiev, Jiri Kosina, Thomas Gleixner, Andi Kleen
In-Reply-To: <20100730141901.GA9076@redhat.com>

Hello,

On 07/30/2010 04:19 PM, Oleg Nesterov wrote:
> But I must admit, I personally dislike this idea. A kernel thread which
> is the child of a user-space process is in fact not a "real"
> kernel thread. I think this goes against the common case. If you do not
> care about signals/reparenting, why can't you fork a user-space process
> which does all the work via ioctls? OK, I do not understand the problem
> domain, so probably this can't work.

Having kernel threads which are children of a user process is plain
scary considering the many things the parent/child relationship implies
and the various bugs and security vulnerabilities in the area.  I can't
pinpoint any problem but I think we really shouldn't be adding
something like this for this specific use case.  If we can get away
with exporting a few symbols, let's go that way.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCH] act_nat: the checksum of ICMP doesn't have pseudo header
From: Herbert Xu @ 2010-07-30 14:30 UTC (permalink / raw)
  To: Changli Gao
  Cc: Jamal Hadi Salim, David S. Miller, netdev, linux-kernel,
	Patrick McHardy
In-Reply-To: <AANLkTimKXsgmOSVmw-cP6Zw6UoK5CSshK38Oo5tLuAeA@mail.gmail.com>

On Fri, Jul 30, 2010 at 10:16:05PM +0800, Changli Gao wrote:
> 
> I know we need to update the ICMP checksum if we alter the payload (the
> inner IP header here) of ICMP. But I doubt that the update is really
> necessary if the checksum is partial, as the checksum will be computed
> later (by either skb_checksum_help() or the NIC hardware). In fact, as there
> isn't any pseudo header, icmph->checksum should always be ZERO,
> otherwise skb_checksum_help() or the NIC will produce the wrong checksum
> when the checksum is partial.

Actually you are right.  I suppose the only reason this has never
shown up is because CHECKSUM_PARTIAL doesn't usually occur with
forwarded packets.

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH repost] sched: export sched_set/getaffinity to modules
From: Oleg Nesterov @ 2010-07-30 14:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sridhar Samudrala, Peter Zijlstra, Tejun Heo, Ingo Molnar, netdev,
	lkml, kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev,
	Jiri Kosina, Thomas Gleixner, Andi Kleen
In-Reply-To: <20100727154155.GA13419@redhat.com>

Sorry for the delay, I can't be responsive these days...

On 07/27, Michael S. Tsirkin wrote:
>
> On Mon, Jul 26, 2010 at 08:08:34PM +0200, Oleg Nesterov wrote:
> > On 07/26, Sridhar Samudrala wrote:
> > >
> > > I have been testing out a similar patch that uses kernel_thread() without CLONE_FILES
> > > flag rather than create_kthread() and then closing the files.
> >
> > !CLONE_FILES can't help. copy_files() does dup_fd() in this case.
> > The child still inherits the files.
> >
> > > Either version should be fine.
> >
> > I think neither version is fine ;)
> >
> > exit_files() is not enough too. How about the signals, reparenting?
> >
> >
> > I already forgot all the details, probably I missed something. But it
> > seems to me that it is better to just export get/set affinity and
> > forget about all complications.
> >
> > Oleg.
>
> Oleg, so can I attach your Ack to the patch in question, and merge
> it all through net-next?

Well, I do not think you need my ack ;)

But I must admit, I personally dislike this idea. A kernel thread which
is the child of a user-space process is in fact not a "real"
kernel thread. I think this goes against the common case. If you do not
care about signals/reparenting, why can't you fork a user-space process
which does all the work via ioctls? OK, I do not understand the problem
domain, so probably this can't work.

Anyway, the patch looks buggy to me. Starting from

	create_kthread(&create);
	wait_for_completion(&create.done);

At least you should check that create_kthread() succeeds, otherwise
wait_for_completion() will hang forever. OTOH, if it succeeds then
wait_for_completion() is not needed. But this is minor.

create_kthread()->kernel_thread() uses CLONE_VM, which means that the
child will share ->mm. And this means that if the parent receives
a coredumping signal it will hang forever in kernel space, waiting
until this child exits.

This is just the immediate surprise I can see with this approach,
I am afraid there is something else.

And once again. We are doing these hacks only because we lack a
couple of exports (iiuc). This is, well, a bit strange ;)

Oleg.

^ permalink raw reply

* Re: [PATCH] act_nat: the checksum of ICMP doesn't have pseudo header
From: Changli Gao @ 2010-07-30 14:16 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Jamal Hadi Salim, David S. Miller, netdev, linux-kernel,
	Patrick McHardy
In-Reply-To: <20100730102409.GA8590@gondor.apana.org.au>

On Fri, Jul 30, 2010 at 6:24 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> The checksum update is for the inner IP header.  netfilter does
> of course update the checksum, it just doesn't do it here which is
> for the outer IP header.
>

I know we need to update the ICMP checksum if we alter the payload (the
inner IP header here) of ICMP. But I doubt that the update is really
necessary if the checksum is partial, as the checksum will be computed
later (by either skb_checksum_help() or the NIC hardware). In fact, as there
isn't any pseudo header, icmph->checksum should always be ZERO,
otherwise skb_checksum_help() or the NIC will produce the wrong checksum
when the checksum is partial.
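
To illustrate the point, here is a minimal userspace sketch (not kernel
code). It assumes a late checksum step that sums the whole ICMP message,
including whatever is currently stored in the checksum field, and then
writes back the folded complement. late_complete() is a hypothetical
stand-in for that step, and the helper names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over a byte buffer (big-endian 16-bit words). */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)data[i] << 8 | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;	/* pad odd byte with zero */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);	/* fold carries */
	return (uint16_t)~sum;
}

/*
 * Hypothetical model of a late checksum step: it sums the message as it
 * is, checksum field (bytes 2-3) included, and stores the result there.
 * ICMP has no pseudo header, so nothing else is added to the sum.
 */
static void late_complete(uint8_t *icmp, size_t len)
{
	uint16_t csum = inet_checksum(icmp, len);

	icmp[2] = csum >> 8;
	icmp[3] = csum & 0xff;
}

/* A message with a correct checksum sums to zero after complementing. */
static int icmp_checksum_ok(const uint8_t *icmp, size_t len)
{
	return inet_checksum(icmp, len) == 0;
}
```

In this model, an echo request whose checksum field is zero before
late_complete() runs verifies afterwards, while one that carries a stale
non-zero value in the field does not.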

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)

^ permalink raw reply

* pcap and ARP hardware type for IEEE 802.15.4
From: Jon Smirl @ 2010-07-30 14:05 UTC (permalink / raw)
  To: linux-zigbee-devel, Netdev

I'm working on pcap support for the native IEEE 802.15.4
implementation in the kernel.
http://sourceforge.net/apps/trac/linux-zigbee/

pcap uses ARPHRD_ to write the DLT value into the pcap file. We need a
new DLT for the internal Linux packets since they have had the two FCS
bytes stripped from the end.
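
For reference, a rough userspace sketch of what "FCS stripped" means for
a captured frame. It assumes the FCS is the 16-bit ITU-T CRC
(x^16 + x^12 + x^5 + 1, initial value 0, bits processed LSB first) and
that it sits in the last two bytes of the frame, low byte first. The
names fcs16() and strip_fcs() are made up for this example and are not
taken from the kernel or from libpcap.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ITU-T CRC, reflected form of polynomial 0x1021, initial value 0. */
static uint16_t fcs16(const uint8_t *data, size_t len)
{
	uint16_t crc = 0;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc & 1) ? (crc >> 1) ^ 0x8408 : crc >> 1;
	}
	return crc;
}

/*
 * Check the trailing FCS and return the frame length without it,
 * or -1 if the frame is too short or the FCS does not match.
 * Assumption: the FCS is stored low byte first.
 */
static long strip_fcs(const uint8_t *frame, size_t len)
{
	uint16_t want;

	if (len < 2)
		return -1;
	want = (uint16_t)(frame[len - 2] | frame[len - 1] << 8);
	if (fcs16(frame, len - 2) != want)
		return -1;
	return (long)(len - 2);
}
```

A capture taken on the wire would carry the two extra bytes that
strip_fcs() removes, while the packets Linux hands to pcap internally
would already be in the shorter form, which is why a separate DLT
value is needed to tell the two apart.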

The 802.15.4 implementation has used two different ARPHRD types, one
for the active interface and one for the monitor interface. Has that
been done correctly? Should there be two types?

void ieee802154_monitor_setup(struct net_device *dev)
{
       struct ieee802154_sub_if_data *priv;

       dev->addr_len           = 0;
       dev->features           = NETIF_F_NO_CSUM;
       dev->hard_header_len    = 0;
       dev->needed_tailroom    = 2; /* FCS */
       dev->mtu                = 127;
       dev->tx_queue_len       = 10;
       dev->type               = ARPHRD_IEEE802154_MONITOR;
       dev->flags              = IFF_NOARP | IFF_BROADCAST;
       dev->watchdog_timeo     = 0;

       dev->destructor         = free_netdev;
       dev->netdev_ops         = &ieee802154_monitor_ops;
       dev->ml_priv            = &mac802154_mlme;


void ieee802154_wpan_setup(struct net_device *dev)
{
       struct ieee802154_sub_if_data *priv;

       dev->addr_len           = IEEE802154_ADDR_LEN;
       memset(dev->broadcast, 0xff, IEEE802154_ADDR_LEN);
       dev->features           = NETIF_F_NO_CSUM;
       dev->hard_header_len    = 2 + 1 + 20 + 14;
       dev->header_ops         = &ieee802154_header_ops;
       dev->needed_tailroom    = 2; /* FCS */
       dev->mtu                = 1280;
       dev->tx_queue_len       = 10;
       dev->type               = ARPHRD_IEEE802154;
       dev->flags              = IFF_NOARP | IFF_BROADCAST;
       dev->watchdog_timeo     = 0;

       dev->destructor         = free_netdev;
       dev->netdev_ops         = &ieee802154_wpan_ops;
       dev->ml_priv            = &mac802154_mlme;




-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* [PATCH -staging] slicoss: Remove net_device_stats from the driver's private
From: Denis Kirjanov @ 2010-07-30 13:26 UTC (permalink / raw)
  To: greg; +Cc: gregkh, liodot, dkirjanov, charrer, davem, netdev, linux-kernel

Remove net_device_stats from the driver's private adapter structure and
use the stats embedded in struct net_device instead.

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
---
 drivers/staging/slicoss/slicoss.c |   41 +++++++++++++++++++------------------
 2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/staging/slicoss/slic.h b/drivers/staging/slicoss/slic.h
index cab1a61..beab400 100644
--- a/drivers/staging/slicoss/slic.h
+++ b/drivers/staging/slicoss/slic.h
@@ -499,7 +499,6 @@ struct adapter {
     struct slic_ifevents  if_events;
     struct slic_stats        inicstats_prev;
     struct slicnet_stats     slic_stats;
-    struct net_device_stats stats;
 };
 
 
diff --git a/drivers/staging/slicoss/slicoss.c b/drivers/staging/slicoss/slicoss.c
index d442e3b..f8c4b12 100644
--- a/drivers/staging/slicoss/slicoss.c
+++ b/drivers/staging/slicoss/slicoss.c
@@ -860,6 +860,7 @@ static void slic_config_clear(struct adapter *adapter)
 static bool slic_mac_filter(struct adapter *adapter,
 			struct ether_header *ether_frame)
 {
+	struct net_device *netdev = adapter->netdev;
 	u32 opts = adapter->macopts;
 	u32 *dhost4 = (u32 *)&ether_frame->ether_dhost[0];
 	u16 *dhost2 = (u16 *)&ether_frame->ether_dhost[4];
@@ -879,7 +880,7 @@ static bool slic_mac_filter(struct adapter *adapter,
 	if (ether_frame->ether_dhost[0] & 0x01) {
 		if (opts & MAC_ALLMCAST) {
 			adapter->rcv_multicasts++;
-			adapter->stats.multicast++;
+			netdev->stats.multicast++;
 			return true;
 		}
 		if (opts & MAC_MCAST) {
@@ -889,7 +890,7 @@ static bool slic_mac_filter(struct adapter *adapter,
 				if (!compare_ether_addr(mcaddr->address,
 							ether_frame->ether_dhost)) {
 					adapter->rcv_multicasts++;
-					adapter->stats.multicast++;
+					netdev->stats.multicast++;
 					return true;
 				}
 				mcaddr = mcaddr->next;
@@ -2199,11 +2200,10 @@ static int slic_debug_card_show(struct seq_file *seq, void *v)
 static int slic_debug_adapter_show(struct seq_file *seq, void *v)
 {
 	struct adapter *adapter = seq->private;
+	struct net_device *netdev = adapter->netdev;
 
-	if ((adapter->netdev) && (adapter->netdev->name)) {
-		seq_printf(seq, "info: interface          : %s\n",
+	seq_printf(seq, "info: interface          : %s\n",
 			    adapter->netdev->name);
-	}
 	seq_printf(seq, "info: status             : %s\n",
 		SLIC_LINKSTATE(adapter->linkstate));
 	seq_printf(seq, "info: port               : %d\n",
@@ -2221,9 +2221,9 @@ static int slic_debug_adapter_show(struct seq_file *seq, void *v)
 	seq_printf(seq, "info: RcvQ current       : %4.4X\n",
 		    adapter->rcvqueue.count);
 	seq_printf(seq, "rx stats: packets                  : %8.8lX\n",
-		    adapter->stats.rx_packets);
+		    netdev->stats.rx_packets);
 	seq_printf(seq, "rx stats: bytes                    : %8.8lX\n",
-		    adapter->stats.rx_bytes);
+		    netdev->stats.rx_bytes);
 	seq_printf(seq, "rx stats: broadcasts               : %8.8X\n",
 		    adapter->rcv_broadcasts);
 	seq_printf(seq, "rx stats: multicasts               : %8.8X\n",
@@ -2237,13 +2237,13 @@ static int slic_debug_adapter_show(struct seq_file *seq, void *v)
 	seq_printf(seq, "rx stats: drops                    : %8.8X\n",
 			(u32) adapter->rcv_drops);
 	seq_printf(seq, "tx stats: packets                  : %8.8lX\n",
-			adapter->stats.tx_packets);
+			netdev->stats.tx_packets);
 	seq_printf(seq, "tx stats: bytes                    : %8.8lX\n",
-			adapter->stats.tx_bytes);
+			netdev->stats.tx_bytes);
 	seq_printf(seq, "tx stats: errors                   : %8.8X\n",
 			(u32) adapter->slic_stats.iface.xmt_errors);
 	seq_printf(seq, "rx stats: multicasts               : %8.8lX\n",
-			adapter->stats.multicast);
+			netdev->stats.multicast);
 	seq_printf(seq, "tx stats: collision errors         : %8.8X\n",
 			(u32) adapter->slic_stats.iface.xmit_collisions);
 	seq_printf(seq, "perf: Max rcv frames/isr           : %8.8X\n",
@@ -2620,13 +2620,14 @@ static void slic_xmit_fail(struct adapter *adapter,
 		}
 	}
 	dev_kfree_skb(skb);
-	adapter->stats.tx_dropped++;
+	adapter->netdev->stats.tx_dropped++;
 }
 
 static void slic_rcv_handle_error(struct adapter *adapter,
 					struct slic_rcvbuf *rcvbuf)
 {
 	struct slic_hddr_wds *hdr = (struct slic_hddr_wds *)rcvbuf->data;
+	struct net_device *netdev = adapter->netdev;
 
 	if (adapter->devid != SLIC_1GB_DEVICE_ID) {
 		if (hdr->frame_status14 & VRHSTAT_802OE)
@@ -2637,15 +2638,15 @@ static void slic_rcv_handle_error(struct adapter *adapter,
 			adapter->if_events.uflow802++;
 		if (hdr->frame_status_b14 & VRHSTATB_RCVE) {
 			adapter->if_events.rcvearly++;
-			adapter->stats.rx_fifo_errors++;
+			netdev->stats.rx_fifo_errors++;
 		}
 		if (hdr->frame_status_b14 & VRHSTATB_BUFF) {
 			adapter->if_events.Bufov++;
-			adapter->stats.rx_over_errors++;
+			netdev->stats.rx_over_errors++;
 		}
 		if (hdr->frame_status_b14 & VRHSTATB_CARRE) {
 			adapter->if_events.Carre++;
-			adapter->stats.tx_carrier_errors++;
+			netdev->stats.tx_carrier_errors++;
 		}
 		if (hdr->frame_status_b14 & VRHSTATB_LONGE)
 			adapter->if_events.Longe++;
@@ -2653,7 +2654,7 @@ static void slic_rcv_handle_error(struct adapter *adapter,
 			adapter->if_events.Invp++;
 		if (hdr->frame_status_b14 & VRHSTATB_CRC) {
 			adapter->if_events.Crc++;
-			adapter->stats.rx_crc_errors++;
+			netdev->stats.rx_crc_errors++;
 		}
 		if (hdr->frame_status_b14 & VRHSTATB_DRBL)
 			adapter->if_events.Drbl++;
@@ -2719,6 +2720,7 @@ static void slic_rcv_handle_error(struct adapter *adapter,
 
 static void slic_rcv_handler(struct adapter *adapter)
 {
+	struct net_device *netdev = adapter->netdev;
 	struct sk_buff *skb;
 	struct slic_rcvbuf *rcvbuf;
 	u32 frames = 0;
@@ -2744,8 +2746,8 @@ static void slic_rcv_handler(struct adapter *adapter)
 		skb_pull(skb, SLIC_RCVBUF_HEADSIZE);
 		rx_bytes = (rcvbuf->length & IRHDDR_FLEN_MSK);
 		skb_put(skb, rx_bytes);
-		adapter->stats.rx_packets++;
-		adapter->stats.rx_bytes += rx_bytes;
+		netdev->stats.rx_packets++;
+		netdev->stats.rx_bytes += rx_bytes;
 #if SLIC_OFFLOAD_IP_CHECKSUM
 		skb->ip_summed = CHECKSUM_UNNECESSARY;
 #endif
@@ -2939,8 +2941,8 @@ static netdev_tx_t slic_xmit_start(struct sk_buff *skb, struct net_device *dev)
 		if (skbtype == NORMAL_ETHFRAME)
 			slic_xmit_build_request(adapter, hcmd, skb);
 	}
-	adapter->stats.tx_packets++;
-	adapter->stats.tx_bytes += skb->len;
+	dev->stats.tx_packets++;
+	dev->stats.tx_bytes += skb->len;
 
 #ifdef DEBUG_DUMP
 	if (adapter->kill_card) {
@@ -2972,7 +2974,6 @@ xmit_fail:
 static void slic_adapter_freeresources(struct adapter *adapter)
 {
 	slic_init_cleanup(adapter);
-	memset(&adapter->stats, 0, sizeof(struct net_device_stats));
 	adapter->error_interrupts = 0;
 	adapter->rcv_interrupts = 0;
 	adapter->xmit_interrupts = 0;

^ permalink raw reply related

* Re: RPS vs. hard-irq-context netif_rx()
From: Johannes Berg @ 2010-07-30 13:21 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1280495027.3710.13.camel@jlt3.sipsolutions.net>

On Fri, 2010-07-30 at 15:03 +0200, Johannes Berg wrote:

> The reason seems to be that
>  * RCU_TREE uses local_bh_disable/enable
>  * CONFIG_RPS uses RCU within netif_rx()
>  * the driver I'm using calls netif_rx() within the irq context
> 
> So .. where's the bug? I'd point to CONFIG_RPS since it's newest.

Disabling RPS gives me the same result due to netpoll since

commit de85d99eb7b595f6751550184b94c1e2f74a828b
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Thu Jun 10 16:12:44 2010 +0000

    netpoll: Fix RCU usage

Disabling netpoll as well gets me a warning-free boot, but I suspect
that's not really what it should be like...

johannes

^ permalink raw reply

* RPS vs. hard-irq-context netif_rx()
From: Johannes Berg @ 2010-07-30 13:03 UTC (permalink / raw)
  To: netdev

I got the following on a kvm instance I use for testing:

[   51.358803] WARNING: at kernel/lockdep.c:2327 trace_hardirqs_on_caller+0x167/0x200()
[   51.360210] Hardware name: Bochs
[   51.360210] Modules linked in:
[   51.360210] Pid: 1546, comm: dhclient3 Tainted: G        W   2.6.35-rc6-wl+ #588
[   51.360210] Call Trace:
[   51.360210]  <IRQ>  [<ffffffff8104e87f>] warn_slowpath_common+0x7f/0xc0
[   51.360210]  [<ffffffff8104e8da>] warn_slowpath_null+0x1a/0x20
[   51.360210]  [<ffffffff81087977>] trace_hardirqs_on_caller+0x167/0x200
[   51.360210]  [<ffffffff81087a1d>] trace_hardirqs_on+0xd/0x10
[   51.360210]  [<ffffffff81056ade>] local_bh_enable+0x9e/0x130
[   51.360210]  [<ffffffff8139fa82>] netif_rx+0xc2/0x250
[   51.360210]  [<ffffffff813146d7>] ei_receive+0x1b7/0x2c0
[   51.360210]  [<ffffffff81314ca2>] __ei_interrupt+0x282/0x360
[   51.360210]  [<ffffffff81314dce>] ei_interrupt+0xe/0x10
[   51.360210]  [<ffffffff810abe85>] handle_IRQ_event+0x85/0x300
[   51.360210]  [<ffffffff810aead5>] handle_level_irq+0x95/0x120
[   51.360210]  [<ffffffff81005a12>] handle_irq+0x22/0x30
[   51.360210]  [<ffffffff814394b3>] do_IRQ+0x73/0xf0
...


The reason seems to be that
 * RCU_TREE uses local_bh_disable/enable
 * CONFIG_RPS uses RCU within netif_rx()
 * the driver I'm using calls netif_rx() within the irq context

So .. where's the bug? I'd point to CONFIG_RPS since it's newest.

johannes


^ permalink raw reply
