Netdev List


* [2.6 patch] rtnetlink.c: #if 0 no longer used functions
From: Adrian Bunk @ 2008-01-30 20:02 UTC (permalink / raw)
  To: Patrick McHardy, David S. Miller; +Cc: netdev

This patch #if 0's the following no longer used functions:
- rtattr_parse()
- rtattr_strlcpy()
- __rtattr_parse_nested_compat()

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

 include/linux/rtnetlink.h |   12 ------------
 net/core/rtnetlink.c      |    9 ++++++---
 2 files changed, 6 insertions(+), 15 deletions(-)

06cd9ace5f9ca3d8070364d33ca76d1fa4cd203b 
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index b014f6b..b9e1740 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -602,24 +602,12 @@ struct tcamsg
 
 #include <linux/mutex.h>
 
-extern size_t rtattr_strlcpy(char *dest, const struct rtattr *rta, size_t size);
 static __inline__ int rtattr_strcmp(const struct rtattr *rta, const char *str)
 {
 	int len = strlen(str) + 1;
 	return len > rta->rta_len || memcmp(RTA_DATA(rta), str, len);
 }
 
-extern int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, int len);
-extern int __rtattr_parse_nested_compat(struct rtattr *tb[], int maxattr,
-				        struct rtattr *rta, int len);
-
-#define rtattr_parse_nested(tb, max, rta) \
-	rtattr_parse((tb), (max), RTA_DATA((rta)), RTA_PAYLOAD((rta)))
-
-#define rtattr_parse_nested_compat(tb, max, rta, data, len) \
-({	data = RTA_PAYLOAD(rta) >= len ? RTA_DATA(rta) : NULL; \
-	__rtattr_parse_nested_compat(tb, max, rta, len); })
-
 extern int rtnetlink_send(struct sk_buff *skb, struct net *net, u32 pid, u32 group, int echo);
 extern int rtnl_unicast(struct sk_buff *skb, struct net *net, u32 pid);
 extern int rtnl_notify(struct sk_buff *skb, struct net *net, u32 pid, u32 group,
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index ddbdde8..a689f17 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -82,6 +82,8 @@ int rtnl_trylock(void)
 	return mutex_trylock(&rtnl_mutex);
 }
 
+#if 0
+
 int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, int len)
 {
 	memset(tb, 0, sizeof(struct rtattr*)*maxattr);
@@ -108,6 +110,8 @@ int __rtattr_parse_nested_compat(struct rtattr *tb[], int maxattr,
 	return 0;
 }
 
+#endif  /*  0  */
+
 static struct rtnl_link *rtnl_msg_handlers[NPROTO];
 
 static inline int rtm_msgindex(int msgtype)
@@ -442,6 +446,7 @@ void __rta_fill(struct sk_buff *skb, int attrtype, int attrlen, const void *data
 	memset(RTA_DATA(rta) + attrlen, 0, RTA_ALIGN(size) - size);
 }
 
+#if 0
 size_t rtattr_strlcpy(char *dest, const struct rtattr *rta, size_t size)
 {
 	size_t ret = RTA_PAYLOAD(rta);
@@ -456,6 +461,7 @@ size_t rtattr_strlcpy(char *dest, const struct rtattr *rta, size_t size)
 	}
 	return ret;
 }
+#endif  /*  0  */
 
 int rtnetlink_send(struct sk_buff *skb, struct net *net, u32 pid, unsigned group, int echo)
 {
@@ -1411,9 +1417,6 @@ void __init rtnetlink_init(void)
 }
 
 EXPORT_SYMBOL(__rta_fill);
-EXPORT_SYMBOL(rtattr_strlcpy);
-EXPORT_SYMBOL(rtattr_parse);
-EXPORT_SYMBOL(__rtattr_parse_nested_compat);
 EXPORT_SYMBOL(rtnetlink_put_metrics);
 EXPORT_SYMBOL(rtnl_lock);
 EXPORT_SYMBOL(rtnl_trylock);



* [2.6 patch] make nf_ct_path[] static
From: Adrian Bunk @ 2008-01-30 20:02 UTC (permalink / raw)
  To: Pavel Emelyanov, Patrick McHardy, David S. Miller
  Cc: netfilter-devel, netdev, linux-kernel

This patch makes the needlessly global nf_ct_path[] static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---
6396fbcebe3eb61f7e6eb1a671920a515912b005 
diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
index 696074a..5bd38a6 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -380,7 +380,7 @@ static ctl_table nf_ct_netfilter_table[] = {
 	{ .ctl_name = 0 }
 };

-struct ctl_path nf_ct_path[] = {
+static struct ctl_path nf_ct_path[] = {
 	{ .procname = "net", .ctl_name = CTL_NET, },
 	{ }
 };


* [2.6 patch] e1000e/ethtool.c: make a function static
From: Adrian Bunk @ 2008-01-30 20:02 UTC (permalink / raw)
  To: Joe Perches, Auke Kok, Jeff Garzik, jesse.brandeburg,
	jeffrey.t.kirsher, john.ronciak
  Cc: e1000-devel, netdev, linux-kernel

This patch makes the needlessly global reg_pattern_test_array() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---
ed72e457f06311390d9a9e51a00c904939466aff 
diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
index 6d9c27f..a2034cf 100644
--- a/drivers/net/e1000e/ethtool.c
+++ b/drivers/net/e1000e/ethtool.c
@@ -690,8 +690,8 @@ err_setup:
 	return err;
 }
 
-bool reg_pattern_test_array(struct e1000_adapter *adapter, u64 *data,
-			    int reg, int offset, u32 mask, u32 write)
+static bool reg_pattern_test_array(struct e1000_adapter *adapter, u64 *data,
+				   int reg, int offset, u32 mask, u32 write)
 {
 	int i;
 	u32 read;



* [2.6 patch] drivers/net/sunvnet.c:print_version() must be __devinit
From: Adrian Bunk @ 2008-01-30 20:03 UTC (permalink / raw)
  To: davem, Jeff Garzik; +Cc: netdev, linux-kernel

This patch fixes the following section mismatches:

<--  snip  -->

...
WARNING: drivers/net/sunvnet.o(.text+0x220): Section mismatch in reference from the function print_version() to the variable .devinit.data:version
WARNING: drivers/net/sunvnet.o(.text+0x228): Section mismatch in reference from the function print_version() to the variable .devinit.data:version
...

<--  snip  -->

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---
f865222179806c4475cd79c2fb92ec622f88da3f 
diff --git a/drivers/net/sunvnet.c b/drivers/net/sunvnet.c
index 4a0035f..6415ce1 100644
--- a/drivers/net/sunvnet.c
+++ b/drivers/net/sunvnet.c
@@ -1130,7 +1130,7 @@ static struct vio_driver_ops vnet_vio_ops = {
 	.handshake_complete	= vnet_handshake_complete,
 };

-static void print_version(void)
+static void __devinit print_version(void)
 {
 	static int version_printed;


* Re: [2.6 patch] make nf_ct_path[] static
From: Patrick McHardy @ 2008-01-30 20:02 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Pavel Emelyanov, David S. Miller, netfilter-devel, netdev,
	linux-kernel
In-Reply-To: <20080130200259.GE29368@does.not.exist>

Adrian Bunk wrote:
> This patch makes the needlessly global nf_ct_path[] static.


I already have this queued.


* [2.6 patch] net/sunbmac.c section fix
From: Adrian Bunk @ 2008-01-30 20:03 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: netdev, linux-kernel

This patch fixes the following section mismatch:

<--  snip  -->

...
WARNING: drivers/net/sunbmac.o(.devinit.text+0x24): Section mismatch in reference from the function bigmac_sbus_probe() to the function .init.text:bigmac_ether_init()
...

<--  snip  -->

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---
c4238b1ec3c23ec9dbe8b4da932cfd381ef0f376 
diff --git a/drivers/net/sunbmac.c b/drivers/net/sunbmac.c
index fe3ac6f..0e4a88d 100644
--- a/drivers/net/sunbmac.c
+++ b/drivers/net/sunbmac.c
@@ -1075,7 +1075,7 @@ static const struct ethtool_ops bigmac_ethtool_ops = {
 	.get_link		= bigmac_get_link,
 };
 
-static int __init bigmac_ether_init(struct sbus_dev *qec_sdev)
+static int __devinit bigmac_ether_init(struct sbus_dev *qec_sdev)
 {
 	struct net_device *dev;
 	static int version_printed;



* [2.6 patch] net/sunqe.c section fix
From: Adrian Bunk @ 2008-01-30 20:03 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: netdev, linux-kernel

This patch fixes the following section mismatch:

<--  snip  -->

...
WARNING: drivers/net/sunqe.o(.devinit.text+0x4): Section mismatch in reference from the function qec_sbus_probe() to the function .init.text:qec_ether_init()
...

<--  snip  -->

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

 drivers/net/sunqe.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

bee65cb0dd698bbda02b1087bffed51e3a2488cb 
diff --git a/drivers/net/sunqe.c b/drivers/net/sunqe.c
index ff23c64..e811331 100644
--- a/drivers/net/sunqe.c
+++ b/drivers/net/sunqe.c
@@ -747,7 +747,7 @@ static inline void qec_init_once(struct sunqec *qecp, struct sbus_dev *qsdev)
 		    qecp->gregs + GLOB_RSIZE);
 }
 
-static u8 __init qec_get_burst(struct device_node *dp)
+static u8 __devinit qec_get_burst(struct device_node *dp)
 {
 	u8 bsizes, bsizes_more;
 
@@ -767,7 +767,7 @@ static u8 __init qec_get_burst(struct device_node *dp)
 	return bsizes;
 }
 
-static struct sunqec * __init get_qec(struct sbus_dev *child_sdev)
+static struct sunqec * __devinit get_qec(struct sbus_dev *child_sdev)
 {
 	struct sbus_dev *qec_sdev = child_sdev->parent;
 	struct sunqec *qecp;
@@ -823,7 +823,7 @@ fail:
 	return NULL;
 }
 
-static int __init qec_ether_init(struct sbus_dev *sdev)
+static int __devinit qec_ether_init(struct sbus_dev *sdev)
 {
 	static unsigned version_printed;
 	struct net_device *dev;



* Re: [2.6 patch] rtnetlink.c: #if 0 no longer used functions
From: Patrick McHardy @ 2008-01-30 20:04 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: David S. Miller, netdev
In-Reply-To: <20080130200257.GD29368@does.not.exist>

Adrian Bunk wrote:
> This patch #if 0's the following no longer used functions:
> - rtattr_parse()
> - rtattr_strlcpy()
> - __rtattr_parse_nested_compat()
>   

Please remove them instead.


* rtl8150: use default MTU of 1500
From: Lennert Buytenhek @ 2008-01-30 19:37 UTC (permalink / raw)
  To: netdev, jgarzik; +Cc: Petko Manolov

The RTL8150 driver uses an MTU of 1540 by default, which causes a
bunch of problems -- it prevents booting from NFS root, for one.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Petko Manolov <petkan@nucleusys.com>

--- linux-2.6.24-git7.orig/drivers/net/usb/rtl8150.c	2008-01-24 23:58:37.000000000 +0100
+++ linux-2.6.24-git7/drivers/net/usb/rtl8150.c	2008-01-30 20:29:00.000000000 +0100
@@ -925,9 +925,8 @@
 	netdev->hard_start_xmit = rtl8150_start_xmit;
 	netdev->set_multicast_list = rtl8150_set_multicast;
 	netdev->set_mac_address = rtl8150_set_mac_address;
 	netdev->get_stats = rtl8150_netdev_stats;
-	netdev->mtu = RTL8150_MTU;
 	SET_ETHTOOL_OPS(netdev, &ops);
 	dev->intr_interval = 100;	/* 100ms */
 
 	if (!alloc_all_urbs(dev)) {


* Re: Compex FreedomLine 32 PnP-PCI2 broken with de2104x
From: Ondrej Zary @ 2008-01-30 20:23 UTC (permalink / raw)
  To: jgarzik; +Cc: Linux Kernel, netdev
In-Reply-To: <200801262158.12016.linux@rainbow-software.org>

On Saturday 26 January 2008 21:58:10 Ondrej Zary wrote:
> Hello,
> I was having problems with these FreedomLine cards with Linux before but
> tested it thoroughly today. This card uses DEC 21041 chip and has TP and
> BNC connectors:
>
> 00:12.0 Ethernet controller [0200]: Digital Equipment Corporation DECchip
> 21041 [Tulip Pass 3] [1011:0014] (rev 21)
>
>
> de2104x driver was loaded automatically by udev and card seemed to work.
> Until I disconnected the TP cable and put it back after a while. The
> driver then switched to (non-existing) AUI port and remained there. I tried
> to set media to TP using ethtool - and the whole kernel crashed because of
> BUG_ON(de_is_running(de));
> in de_set_media(). Seems that the driver is unable to stop the DMA in
> de_stop_rxtx().
>
> I commented out AUI detection in the driver - this time it switched to BNC
> after unplugging the cable and remained there. I also attempted to reset
> the chip when de_stop_rxtx failed but failed to do it.
>
> Then I found that there's de4x5 driver which supports the same cards as
> de2104x (and some others too) - and this one works fine! I can plug and
> unplug the cable and even change between TP and BNC ports just by
> unplugging one and plugging the other cable in. Unfortunately, this driver
> is blacklisted by default - at least in Slackware and Debian.
>
> The question is: why does de2104x exist? Does it work better with some
> hardware?
>
> BTW. Found that the problem has existed since at least 2003:
> http://oss.sgi.com/archives/netdev/2003-08/msg00951.html

Does the de2104x driver work correctly for anyone?

-- 
Ondrej Zary


* Re: [PATCH net-2.6.25 4/7][ATM]: [br2864] routed support
From: chas williams - CONTRACTOR @ 2008-01-30 20:49 UTC (permalink / raw)
  To: Chung-Chi Lo; +Cc: netdev, davem
In-Reply-To: <3f696b20801282342q43537963ueb8fb9cf3389ac03@mail.gmail.com>

In message <3f696b20801282342q43537963ueb8fb9cf3389ac03@mail.gmail.com>,
"Chung-Chi Lo" writes:
>> +       } else { /* vc-mux */
>> +               if (brdev->payload == p_routed) {
>
>add line
>
>			skb->protocol = __constant_htons(ETH_P_IP);
>
>here just like LLC did?
>
>> +                       skb_reset_network_header(skb);
>> +                       skb->pkt_type = PACKET_HOST;

yes, that is missing but it needs to be a little more complicated than
that i think.  you need to examine the first byte to see if it's an
ipv4 or ipv6 datagram.  something like:

			struct iphdr *iph;

			skb_reset_network_header(skb);
			iph = ip_hdr(skb);
			if (iph->version == 4)
				skb->protocol = __constant_htons(ETH_P_IP);
			else if (iph->version == 6)
				skb->protocol = __constant_htons(ETH_P_IPV6);
			else {
				/* drop the packet */
			}
			skb->pkt_type = PACKET_HOST;

how does that look?
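
A rough userspace sketch of the version check described above, assuming the
payload starts directly at the IP header (as in the routed vc-mux case). The
name classify_routed_payload() and the "return 0 means drop" convention are
hypothetical, chosen only for illustration; what is standard is that the IP
version sits in the top four bits of the first byte, and that the EtherType
values are 0x0800 for IPv4 and 0x86DD for IPv6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* EtherType values for IPv4 and IPv6 (host byte order in this sketch). */
#define SKETCH_ETH_P_IP   0x0800
#define SKETCH_ETH_P_IPV6 0x86DD

/*
 * Hypothetical helper: look at the first byte of a routed payload and
 * return the matching EtherType, or 0 if the packet should be dropped.
 */
static uint16_t classify_routed_payload(const uint8_t *data, size_t len)
{
	assert(data != NULL);

	/* An empty payload has no version nibble to inspect. */
	if (len < 1)
		return 0;

	/* The IP version is the high nibble of the first header byte. */
	switch (data[0] >> 4) {
	case 4:
		return SKETCH_ETH_P_IP;
	case 6:
		return SKETCH_ETH_P_IPV6;
	default:
		return 0;
	}
}
```

In the driver the result would be converted to network byte order before
being stored in skb->protocol, and a return of 0 corresponds to the
"drop the packet" branch.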

>+       } else {
>+               skb_push(skb, 2);
>+               if (brdev->payload == p_bridged)
>+                       memset(skb->data, 0, 2);
>+       }
>
>Here should be
>
>	} else {
>		if (brdev->payload == p_bridged) {
>			skb_push(skb, 2);
>			memset(skb->data, 0, 2);
>		}
>	}
>
>Because VCMUX and routed mode don't need two bytes in the header.

yeah, another oversight.  your fix is correct. i bet you have guessed
that we don't use vc multiplexing.


* Re: [2.6 patch] unexport sysctl_tcp_tso_win_divisor
From: Ilpo Järvinen @ 2008-01-30 21:05 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: David S. Miller, Netdev
In-Reply-To: <20080130200232.GU29368@does.not.exist>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 763 bytes --]

On Wed, 30 Jan 2008, Adrian Bunk wrote:

> This patch removes the no longer used 
> EXPORT_SYMBOL(sysctl_tcp_tso_win_divisor).
> 
> Signed-off-by: Adrian Bunk <bunk@kernel.org>
>
> ---
> 4884e7997ba5f63f2efeaeead21ed2768fb3f4de 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 89f0188..ed750f9 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2564,5 +2564,4 @@ EXPORT_SYMBOL(tcp_connect);
>  EXPORT_SYMBOL(tcp_make_synack);
>  EXPORT_SYMBOL(tcp_simple_retransmit);
>  EXPORT_SYMBOL(tcp_sync_mss);
> -EXPORT_SYMBOL(sysctl_tcp_tso_win_divisor);
>  EXPORT_SYMBOL(tcp_mtup_init);

...yes, results from the recent move of tcp_is_cwnd_limited() away from 
tcp.h.

Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

-- 
 i.


* Re: [PATCH] Add another Prism2 card to hostap
From: Pavel Roskin @ 2008-01-30 21:15 UTC (permalink / raw)
  To: Marcin Juszkiewicz
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA, Jouni Malinen,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <200801290942.56258.openembedded-sbG99h22wavzs492ZWNkEA@public.gmane.org>


On Tue, 2008-01-29 at 09:42 +0100, Marcin Juszkiewicz wrote:
> From: Marcin Juszkiewicz <openembedded-sbG99h22wavzs492ZWNkEA@public.gmane.org>
> 
> Card reported by Ångström user:
> http://bugs.openembedded.net/show_bug.cgi?id=3236
> 
> Socket 1:
>    product info: "Wireless LAN", "11Mbps PC Card", "Version 01.02", ""
>    manfid: 0x0156, 0x0002
>    function: 6 (network)
> 
> Signed-off-by: Marcin Juszkiewicz <openembedded-sbG99h22wavzs492ZWNkEA@public.gmane.org>

Acked-by: Pavel Roskin <proski-mXXj517/zsQ@public.gmane.org>

> diff --git a/drivers/net/wireless/hostap/hostap_cs.c b/drivers/net/wireless/hostap/hostap_cs.c
> index 877d3bd..4f6707c 100644
> --- a/drivers/net/wireless/hostap/hostap_cs.c
> +++ b/drivers/net/wireless/hostap/hostap_cs.c
> @@ -894,6 +894,9 @@ static struct pcmcia_device_id hostap_cs_ids[] = {
>                 "The Linksys Group, Inc.", "Wireless Network CF Card", "ISL37300P",
>                 "RevA",
>                 0xa5f472c2, 0x9c05598d, 0xc9049a39, 0x57a66194),
> +       PCMCIA_DEVICE_PROD_ID123(
> +           "Wireless LAN" , "11Mbps PC Card", "Version 01.02",
> +               0x4b8870ff, 0x70e946d1, 0x4b74baa0),
>         PCMCIA_DEVICE_NULL
>  };
>  MODULE_DEVICE_TABLE(pcmcia, hostap_cs_ids);
> 
-- 
Regards,
Pavel Roskin


* Re: [2.6 patch] e1000e/ethtool.c: make a function static
From: Kok, Auke @ 2008-01-30 21:51 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Jeff Garzik, e1000-devel, netdev, jesse.brandeburg, linux-kernel,
	john.ronciak, jeffrey.t.kirsher, Joe Perches
In-Reply-To: <20080130200243.GY29368@does.not.exist>

Adrian Bunk wrote:
> This patch makes the needlessly global reg_pattern_test_array() static.
> 
> Signed-off-by: Adrian Bunk <bunk@kernel.org>

Stephen Hemminger already pointed this out to me... I'll certainly push this
upstream, thanks Adrian!

Auke


> 
> ---
> ed72e457f06311390d9a9e51a00c904939466aff 
> diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
> index 6d9c27f..a2034cf 100644
> --- a/drivers/net/e1000e/ethtool.c
> +++ b/drivers/net/e1000e/ethtool.c
> @@ -690,8 +690,8 @@ err_setup:
>  	return err;
>  }
>  
> -bool reg_pattern_test_array(struct e1000_adapter *adapter, u64 *data,
> -			    int reg, int offset, u32 mask, u32 write)
> +static bool reg_pattern_test_array(struct e1000_adapter *adapter, u64 *data,
> +				   int reg, int offset, u32 mask, u32 write)
>  {
>  	int i;
>  	u32 read;
> 



* Re: [2.6 patch] make e1000_dump_eeprom() static
From: Kok, Auke @ 2008-01-30 21:52 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Jeff Garzik, e1000-devel, netdev, jesse.brandeburg, linux-kernel,
	john.ronciak, jeffrey.t.kirsher
In-Reply-To: <20080130200245.GZ29368@does.not.exist>

Adrian Bunk wrote:
> This patch makes the needlessly global e1000_dump_eeprom() static.
> 
> Signed-off-by: Adrian Bunk <bunk@kernel.org>

yes, thanks, I'll push it to Jeff.

Auke

> 
> ---
> b5fd924a1388d4aaa94cf05e42e317c2b1fb5748 
> diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
> index 7f5b2ae..8a6645b 100644
> --- a/drivers/net/e1000/e1000_main.c
> +++ b/drivers/net/e1000/e1000_main.c
> @@ -820,7 +820,7 @@ e1000_reset(struct e1000_adapter *adapter)
>  /**
>   *  Dump the eeprom for users having checksum issues
>   **/
> -void e1000_dump_eeprom(struct e1000_adapter *adapter)
> +static void e1000_dump_eeprom(struct e1000_adapter *adapter)
>  {
>  	struct net_device *netdev = adapter->netdev;
>  	struct ethtool_eeprom eeprom;
> 



* Re: e1000 full-duplex TCP performance well below wire speed
From: Bruce Allen @ 2008-01-30 22:25 UTC (permalink / raw)
  To: Linux Kernel Mailing List, netdev, Stephen Hemminger
In-Reply-To: <20080130082136.1017631d@deepthought>

Hi Stephen,

Thanks for your helpful reply and especially for the literature pointers.

>> Indeed, we are not asking to see 1000 Mb/s.  We'd be happy to see 900
>> Mb/s.
>>
>> Netperf is transmitting a large buffer in MTU-sized packets (min 1500
>> bytes).  Since the acks are only about 60 bytes in size, they should be
>> around 4% of the total traffic.  Hence we would not expect to see more
>> than 960 Mb/s.

> Don't forget the network overhead: http://sd.wareonearth.com/~phil/net/overhead/
> Max TCP Payload data rates over ethernet:
>  (1500-40)/(38+1500) = 94.9285 %  IPv4, minimal headers
>  (1500-52)/(38+1500) = 94.1482 %  IPv4, TCP timestamps

Yes.  If you look further down the page, you will see that with jumbo 
frames (which we have also tried) on Gb/s ethernet the maximum throughput 
is:

   (9000-20-20-12)/(9000+14+4+7+1+12)*1000000000/1000000 = 990.042 Mbps

We are very far from this number -- averaging perhaps 600 or 700 Mbps.
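
The percentages quoted above can be reproduced with a few lines of C. This
is only a sketch of the arithmetic: the function name is made up for
illustration, the 38 bytes of per-frame overhead is the sum 14+4+7+1+12 from
the formula above, and it assumes back-to-back full-sized frames in one
direction with no ACK traffic competing for the link.

```c
#include <assert.h>

/* Per-frame Ethernet overhead from the quoted formula: 14 + 4 + 7 + 1 + 12 bytes. */
#define ETH_FRAME_OVERHEAD 38

/*
 * Fraction of the wire rate left for TCP payload, given the MTU and the
 * number of IP + TCP header bytes carried in each packet.
 */
static double tcp_payload_fraction(int mtu, int header_bytes)
{
	assert(header_bytes >= 0 && mtu > header_bytes);
	return (double)(mtu - header_bytes) / (double)(mtu + ETH_FRAME_OVERHEAD);
}
```

tcp_payload_fraction(1500, 40) gives about 0.9493 and
tcp_payload_fraction(1500, 52) about 0.9415, matching the two lines quoted
from the overhead page; tcp_payload_fraction(9000, 52) gives about 0.9900,
i.e. roughly 990 Mbps of payload on a 1 Gb/s link.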

> I believe what you are seeing is an effect that occurs when using
> cubic on links with no other idle traffic. With two flows at high speed,
> the first flow consumes most of the router buffer and backs off gradually,
> and the second flow is not very aggressive.  It has been discussed
> back and forth between TCP researchers with no agreement, one side
> says that it is unfairness and the other side says it is not a problem in
> the real world because of the presence of background traffic.

At least in principle, we should have NO congestion here.  We have ports 
on two different machines wired with a crossover cable.  Box A can not 
transmit faster than 1 Gb/s.  Box B should be able to receive that data 
without dropping packets.  It's not doing anything else!

> See:
>  http://www.hamilton.ie/net/pfldnet2007_cubic_final.pdf
>  http://www.csc.ncsu.edu/faculty/rhee/Rebuttal-LSM-new.pdf

This is extremely helpful.  The typical oscillation (startup) period shown 
in the plots in these papers is of order 10 seconds, which is similar to 
the types of oscillation periods that we are seeing.

*However* we have also seen similar behavior with the Reno congestion 
control algorithm.  So this might not be due to cubic, or entirely due to 
cubic.

In our application (cluster computing) we use a very tightly coupled 
high-speed low-latency network.  There is no 'wide area traffic'.  So it's 
hard for me to understand why any networking components or software layers 
should take more than milliseconds to ramp up or back off in speed. 
Perhaps we should be asking for a TCP congestion avoidance algorithm which 
is designed for a data center environment where there are very few hops 
and typical packet delivery times are tens or hundreds of microseconds. 
It's very different than delivering data thousands of km across a WAN.

Cheers,
 	Bruce


* Re: e1000 full-duplex TCP performance well below wire speed
From: Bruce Allen @ 2008-01-30 22:33 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev, Carsten Aulbert, Henning Fehrmann, Bruce Allen
In-Reply-To: <47A0CD3B.3050502@candelatech.com>

Hi Ben,

Thank you for the suggestions and questions.

>> We've connected a pair of modern high-performance boxes with integrated 
>> copper Gb/s Intel NICS, with an ethernet crossover cable, and have run some 
>> netperf full duplex TCP tests.  The transfer rates are well below wire 
>> speed.  We're reporting this as a kernel bug, because we expect a vanilla 
>> kernel with default settings to give wire speed (or close to wire speed) 
>> performance in this case. We DO see wire speed in simplex transfers. The 
>> behavior has been verified on multiple machines with identical hardware.
>
> Try using NICs in the pci-e slots.  We have better luck there, as you 
> usually have more lanes and/or higher quality NIC chipsets available in 
> this case.

It's a good idea.  We can try this, though it will take a little time to 
organize.

> Try a UDP test to make sure the NIC can actually handle the throughput.

I should have mentioned this in my original post -- we already did this.

We can run UDP wire speed full duplex (over 900 Mb/s in each direction, at 
the same time). So the problem stems from TCP or is aggravated by TCP. 
It's not a hardware limitation.

> Look at the actual link usage as reported by the ethernet driver so that 
> you take all of the ACKS and other overhead into account.

OK.  We'll report on this as soon as possible.

> Try the same test using 10G hardware (CX4 NICs are quite affordable 
> these days, and we drove a 2-port 10G NIC based on the Intel ixgbe 
> chipset at around 4Gbps on two ports, full duplex, using pktgen). As in 
> around 16Gbps throughput across the busses.  That may also give you an 
> idea if the bottleneck is hardware or software related.

OK.  That will take more time to organize.

Cheers,
 	Bruce

* Re: e1000 full-duplex TCP performance well below wire speed
From: Stephen Hemminger @ 2008-01-30 22:33 UTC (permalink / raw)
  To: Bruce Allen; +Cc: Linux Kernel Mailing List, netdev
In-Reply-To: <Pine.LNX.4.63.0801301610240.19938@trinity.phys.uwm.edu>

On Wed, 30 Jan 2008 16:25:12 -0600 (CST)
Bruce Allen <ballen@gravity.phys.uwm.edu> wrote:

> Hi Stephen,
> 
> Thanks for your helpful reply and especially for the literature pointers.
> 
> >> Indeed, we are not asking to see 1000 Mb/s.  We'd be happy to see 900
> >> Mb/s.
> >>
> >> Netperf is transmitting a large buffer in MTU-sized packets (min 1500
> >> bytes).  Since the acks are only about 60 bytes in size, they should be
> >> around 4% of the total traffic.  Hence we would not expect to see more
> >> than 960 Mb/s.
> 
> > Don't forget the network overhead: http://sd.wareonearth.com/~phil/net/overhead/
> > Max TCP Payload data rates over ethernet:
> >  (1500-40)/(38+1500) = 94.9285 %  IPv4, minimal headers
> >  (1500-52)/(38+1500) = 94.1482 %  IPv4, TCP timestamps
> 
> Yes.  If you look further down the page, you will see that with jumbo 
> frames (which we have also tried) on Gb/s ethernet the maximum throughput 
> is:
> 
>    (9000-20-20-12)/(9000+14+4+7+1+12)*1000000000/1000000 = 990.042 Mbps
> 
> We are very far from this number -- averaging perhaps 600 or 700 Mbps.
>


That is the upper bound of performance on a standard PCI bus (32 bit).
To go higher you need PCI-X or PCI-Express. Also make sure you are really
getting 64-bit PCI, because I have seen some e1000 PCI-X boards that
are only 32bit.
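
For orientation, the peak bus figures behind this remark can be sketched as
below. The helper names are hypothetical, and the clock rates are
assumptions not stated in the message: 33.33 MHz for conventional 32-bit
PCI, and 2.5 GT/s per lane with 8b/10b encoding for first-generation PCI
Express. These are theoretical peaks shared by all traffic on the bus, so
usable throughput is lower.

```c
#include <assert.h>

/* Peak rate of a parallel PCI bus in Mb/s: bus width in bits times clock in MHz. */
static double pci_peak_mbps(int width_bits, double clock_mhz)
{
	assert(width_bits > 0 && clock_mhz > 0.0);
	return width_bits * clock_mhz;
}

/*
 * Peak per-direction data rate of a first-generation PCIe link in Mb/s:
 * 2500 MT/s per lane, of which 8 out of every 10 bits carry data.
 */
static double pcie_gen1_peak_mbps(int lanes)
{
	assert(lanes > 0);
	return lanes * 2500.0 * 8.0 / 10.0;
}
```

pci_peak_mbps(32, 33.33) is about 1067 Mb/s, which both directions of a
full-duplex gigabit link have to share, while pcie_gen1_peak_mbps(1) is
2000 Mb/s in each direction.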


* Re: [git patches] net driver fixes
From: Francois Romieu @ 2008-01-30 22:47 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: Jeff Garzik, David Miller, netdev, LKML
In-Reply-To: <20080130102116.GA20846@uranus.ravnborg.org>

Sam Ravnborg <sam@ravnborg.org> :
[...]
> > -static struct pci_device_id sis190_pci_tbl[] __devinitdata = {
> > +static struct pci_device_id sis190_pci_tbl[] = {
> >  	{ PCI_DEVICE(PCI_VENDOR_ID_SI, 0x0190), 0, 0, 0 },
> >  	{ PCI_DEVICE(PCI_VENDOR_ID_SI, 0x0191), 0, 0, 1 },
> >  	{ 0, },
> 
> The __devinitdata is OK, it is the following __devinitdata that had
> to be __devinitconst.

Strangely enough, removing the __devinitdata from sis190_pci_tbl
silences the error message here. Do you have an explanation?

-- 
Ueimor


* RE: e1000 full-duplex TCP performance well below wire speed
From: Bruce Allen @ 2008-01-30 23:07 UTC (permalink / raw)
  To: Brandeburg, Jesse; +Cc: netdev, Carsten Aulbert, Henning Fehrmann, Bruce Allen
In-Reply-To: <36D9DB17C6DE9E40B059440DB8D95F52044F81DF@orsmsx418.amr.corp.intel.com>

Hi Jesse,

It's good to be talking directly to one of the e1000 developers and 
maintainers.  Although at this point I am starting to think that the 
issue may be TCP stack related and nothing to do with the NIC.  Am I 
correct that these are quite distinct parts of the kernel?

> The 82573L (a client NIC, regardless of the class of machine it is in)
> only has a x1 connection which does introduce some latency since the
> slot is only capable of about 2Gb/s data total, which includes overhead
> of descriptors and other transactions.  As you approach the maximum of
> the slot it gets more and more difficult to get wire speed in a
> bidirectional test.

According to the Intel datasheet, the PCI-e x1 connection is 2Gb/s in each 
direction.  So we only need to get up to 50% of peak to saturate a 
full-duplex wire-speed link.  I hope that the overhead is not a factor of 
two.

Important note: we ARE able to get full duplex wire speed (over 900 Mb/s 
simultaneously in both directions) using UDP.  The problems occur only with 
TCP connections.

>> The test was done with various mtu sizes ranging from 1500 to 9000,
>> with ethernet flow control switched on and off, and using reno and
>> cubic as a TCP congestion control.
>
> As asked in LKML thread, please post the exact netperf command used to
> start the client/server, whether or not you're using irqbalanced (aka
> irqbalance) and what cat /proc/interrupts looks like (you ARE using MSI,
> right?)

I have to wait until Carsten or Henning wake up tomorrow (now 23:38 in 
Germany).  So we'll provide this info in ~10 hours.

I assume that the interrupt load is distributed among all four cores -- 
the default affinity is 0xff, and I also assume that there is some type of 
interrupt aggregation taking place in the driver.  If the CPUs were not 
able to service the interrupts fast enough, I assume that we would also 
see loss of performance with UDP testing.

> I've recently discovered that particularly with the most recent kernels
> if you specify any socket options (-- -SX -sY) to netperf it does worse
> than if it just lets the kernel auto-tune.

I am pretty sure that no socket options were specified, but again need to 
wait until Carsten or Henning come back on-line.

>> The behavior depends on the setup. In one test we used cubic
>> congestion control, flow control off. The transfer rate in one
>> direction was above 0.9Gb/s while in the other direction it was 0.6
>> to 0.8 Gb/s. After 15-20s the rates flipped. Perhaps the two streams
>> are fighting for resources. (The performance of a full duplex stream
>> should be close to 1Gb/s in both directions.)  A graph of the
>> transfer speed as a function of time is here:
>> https://n0.aei.uni-hannover.de/networktest/node19-new20-noflow.jpg
>> Red shows transmit and green shows receive (please ignore other
>> plots):

> One other thing you can try with e1000 is disabling the dynamic
> interrupt moderation by loading the driver with
> InterruptThrottleRate=8000,8000,... (the number of commas depends on
> your number of ports) which might help in your particular benchmark.

OK.  Is 'dynamic interrupt moderation' another name for 'interrupt 
aggregation'?  Meaning that if more than one interrupt is generated in a 
given time interval, then they are replaced by a single interrupt?

> just for completeness can you post the dump of ethtool -e eth0 and lspci
> -vvv?

Yup, we'll give that info also.

Thanks again!

Cheers,
 	Bruce


* Re: e1000 full-duplex TCP performance well below wire speed
From: Bruce Allen @ 2008-01-30 23:15 UTC (permalink / raw)
  To: Rick Jones
  Cc: Brandeburg, Jesse, netdev, Carsten Aulbert, Henning Fehrmann,
	Bruce Allen
In-Reply-To: <47A0C5B2.1000500@hp.com>

Hi Rick,

First off, thanks for netperf. I've used it a lot and find it an extremely 
useful tool.

>> As asked in LKML thread, please post the exact netperf command used to
>> start the client/server, whether or not you're using irqbalanced (aka
>> irqbalance) and what cat /proc/interrupts looks like (you ARE using MSI,
>> right?)
>
> In particular, it would be good to know if you are doing two concurrent 
> streams, or if you are using the "burst mode" TCP_RR with large 
> request/response sizes method which then is only using one connection.

I'm not sure -- must wait for Henning and Carsten to respond tomorrow.

Cheers,
 	Bruce


* Re: [PATCH] net: NEWEMAC: Remove "rgmii-interface" from rgmii matching table
From: Benjamin Herrenschmidt @ 2008-01-30 23:14 UTC (permalink / raw)
  To: Stefan Roese; +Cc: linuxppc-dev, Josh Boyer, netdev
In-Reply-To: <200801300716.52463.sr@denx.de>


On Wed, 2008-01-30 at 07:16 +0100, Stefan Roese wrote:
> On Wednesday 16 January 2008, Josh Boyer wrote:
> > On Wed, 16 Jan 2008 20:53:59 +1100
> >
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > > On Wed, 2008-01-16 at 10:37 +0100, Stefan Roese wrote:
> > > > > With the removal of the "rgmii-interface" device_type property from
> > > > the dts files, the newemac driver needs an update to only rely on
> > > > compatible property.
> > > >
> > > > Signed-off-by: Stefan Roese <sr@denx.de>
> > >
> > > I need to test if it works on CAB, can't change the DT on those. I'll
> > > let you know tomorrow.
> >
> > This should be fine on CAB.  The rgmii node has:
> >
> > compatible = "ibm,rgmii-axon", "ibm,rgmii"
> >
> > so the match should still catch on the latter.
> 
> How about this patch? Ben, if you think this is ok then we should make sure 
> that it goes in in this merge-window, since the other dts patch relies on it.

It's fine.

Ben.



^ permalink raw reply

* Re: e1000 full-duplex TCP performance well below wire speed
From: Bruce Allen @ 2008-01-30 23:23 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Linux Kernel Mailing List, netdev
In-Reply-To: <20080130143335.7fc9ea21@deepthought>

Hi Stephen,

>>>> Indeed, we are not asking to see 1000 Mb/s.  We'd be happy to see 900
>>>> Mb/s.
>>>>
>>>> Netperf is transmitting a large buffer in MTU-sized packets (min 1500
>>>> bytes).  Since the acks are only about 60 bytes in size, they should be
>>>> around 4% of the total traffic.  Hence we would not expect to see more
>>>> than 960 Mb/s.
>>
>>> Don't forget the network overhead: http://sd.wareonearth.com/~phil/net/overhead/
>>> Max TCP Payload data rates over ethernet:
>>>  (1500-40)/(38+1500) = 94.9285 %  IPv4, minimal headers
>>>  (1500-52)/(38+1500) = 94.1482 %  IPv4, TCP timestamps
>>
>> Yes.  If you look further down the page, you will see that with jumbo
>> frames (which we have also tried) on Gb/s ethernet the maximum throughput
>> is:
>>
>>    (9000-20-20-12)/(9000+14+4+7+1+12)*1000000000/1000000 = 990.042 Mbps
>>
>> We are very far from this number -- averaging perhaps 600 or 700 Mbps.

> That is the upper bound of performance on a standard PCI bus (32 bit).
> To go higher you need PCI-X or PCI-Express. Also make sure you are really
> getting 64-bit PCI, because I have seen some e1000 PCI-X boards that
> are only 32bit.

The motherboard NIC is in a PCI-e x1 slot.  This has a maximum speed of 
250 MB/s (2 Gb/s) in each direction.  It should be a factor of 2 more 
interface speed than is needed.
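
The arithmetic quoted above can be checked with a few lines of C. This is a sketch, not a measurement tool: the helper names are invented here, the 38 bytes of per-frame overhead are the 14+4+7+1+12 from the formula above, and the 250 MB/s per-direction figure for a x1 slot is the one stated in this mail.

```c
#include <assert.h>

/* Bytes on the wire per frame beyond the MTU: 14 (Ethernet header)
 * + 4 (FCS) + 7 (preamble) + 1 (start delimiter) + 12 (inter-frame gap). */
#define ETH_WIRE_OVERHEAD 38

/* Upper bound on TCP payload throughput in Mb/s for a given link speed,
 * MTU, and combined IP+TCP header size (40 minimal, 52 with timestamps). */
double tcp_payload_mbps(double link_mbps, unsigned int mtu,
			unsigned int ip_tcp_hdr)
{
	assert(mtu > ip_tcp_hdr);
	return link_mbps * (mtu - ip_tcp_hdr) /
	       (double)(mtu + ETH_WIRE_OVERHEAD);
}

/* Ratio of bus bandwidth (MB/s, one direction) to link speed (Mb/s).
 * A value of 2.0 means the bus is twice as fast as the link needs. */
double bus_headroom(double bus_mbytes_per_sec, double link_mbps)
{
	assert(link_mbps > 0);
	return bus_mbytes_per_sec * 8.0 / link_mbps;
}
```

tcp_payload_mbps(1000, 1500, 40) and tcp_payload_mbps(1000, 1500, 52) give the 94.9% and 94.1% figures, tcp_payload_mbps(1000, 9000, 52) gives the 990.042 Mbps jumbo-frame bound, and bus_headroom(250, 1000) gives the factor of 2.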

Cheers,
 	Bruce

^ permalink raw reply

* Re: [PATCH] PHYLIB: Add BCM5482 PHY support
From: Nate Case @ 2008-01-30 23:45 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Andy Fleming, netdev
In-Reply-To: <Pine.LNX.4.64N.0801300936040.20317@blysk.ds.pg.gda.pl>

On Wed, 2008-01-30 at 09:51 +0000, Maciej W. Rozycki wrote:
> > +static struct phy_driver bcm5482_driver = {
> > +    .phy_id		= 0x0143bcb0,
> > +	.phy_id_mask	= 0xfffffff0,
> 
>  Please check formatting above and also I am a bit curious as to why the 
> ID is so different from the other ones -- the number is meant to be based 
> on the OUI assigned to the manufacturer.  Otherwise your addition is fine.

I'll re-submit with the formatting fixed.

I can't figure out why the ID is so different from the others, but I did
double-check it and test it on real hardware.

For what it's worth, I've found a lot of inconsistency in these ID
values. For example, the chips with ID1 == 0x0020 seem to use the wrong
set of OUI bits (22:7 instead of 21:6), while others (BCM5221) with ID1
== 0x0040 do it properly conforming to the IEEE standard.

I can't figure out how they got the ID values for the BCM5482.  If you
extract the OUI from 0x0143bcb0, you get 0x0050ef (which the *BSD guys
list as an alternate "mangled" Broadcom OUI).  The BCM5787 and BCM5755
also seem to share this same ID formula with the BCM5482.
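
To make the extraction concrete, here is a small sketch of how the 32-bit ID splits up. It assumes the usual layout where ID1 is the upper 16 bits and ID2 the lower 16, with the top 6 bits of ID2 continuing the OUI, followed by a 6-bit model and a 4-bit revision. The function names are illustrative only and are not phylib API.

```c
#include <assert.h>
#include <stdint.h>

/* OUI bits as stored in the ID registers: all 16 bits of ID1 followed
 * by the top 6 bits of ID2, giving a 22-bit value. */
uint32_t phy_id_oui(uint32_t phy_id)
{
	uint32_t id1 = phy_id >> 16;
	uint32_t id2 = phy_id & 0xffff;

	return (id1 << 6) | (id2 >> 10);
}

/* 6-bit model number, bits 9:4 of ID2. */
unsigned int phy_id_model(uint32_t phy_id)
{
	return (phy_id >> 4) & 0x3f;
}

/* 4-bit revision, bits 3:0 of ID2. */
unsigned int phy_id_rev(uint32_t phy_id)
{
	return phy_id & 0xf;
}

/* Masked comparison of a probed ID against a driver's phy_id and
 * phy_id_mask pair; a zero mask would match every PHY, so reject it. */
int phy_id_matches(uint32_t probed, uint32_t drv_id, uint32_t mask)
{
	assert(mask != 0);
	return (probed & mask) == (drv_id & mask);
}
```

phy_id_oui(0x0143bcb0) returns 0x0050ef, the value mentioned above, and with a mask of 0xfffffff0 only the revision nibble is ignored when matching.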

- Nate Case <ncase@xes-inc.com>

^ permalink raw reply

* Re: Mostly revert "e1000/e1000e: Move PCI-Express device IDs over to e1000e"
From: Adrian Bunk @ 2008-01-30 23:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Randy Dunlap, Linux Kernel Mailing List, auke-jan.h.kok, jeff,
	David S. Miller, akpm, netdev
In-Reply-To: <alpine.LFD.1.00.0801301631400.3426@www.l.google.com>

On Wed, Jan 30, 2008 at 04:51:04PM +1100, Linus Torvalds wrote:
> 
> 
> On Tue, 29 Jan 2008, Randy Dunlap wrote:
> > 
> > Andrew was concerned about this when the driver was in -mm.
> > He asked for a patch that would set E1000E to same value as E1000
> > and I supplied that.  Auke acked it IIRC.  Other people vetoed it.  :(
> 
> Yeah, I've been discussing with Jeff and the gang.
> 
> I think we have agreed on a solution where the ID's show up in the old 
> driver if the new driver is not enabled at all.
> 
> (And as a side note: it turns out that the problem I experienced didn't 
> come from the new e1000e driver after all, so I'll be removing the 
> EXPERIMENTAL flag again).
> 
> So I'd suggest the final patch be something like this, but I'm sending it 
> out just as an example of how we could solve this, not necessarily as a 
> final patch.
> 
> Jeff, Auke, would something like this be acceptable? It makes it very 
> obvious in the driver table which entries are for the PCIE versions that 
> would be handled by the E1000E driver if it is enabled.
> 
> Untested, but as mentioned, this is more of a "this looks maintainable and 
> like it should solve the issues" rather than anything I was planning on 
> committing now.

I don't like it:

We should aim at having exactly one driver for one card.

Your patch has effects like e.g. a kernel behaving differently when 
adding and compiling the e1000e module later compared to having it 
originally in the .config.

And fun like "The card works on my machine with the e1000 driver, why 
doesn't it work in your machine with the e1000 driver?".

And in terms of maintainability, people will disable the e1000e driver 
in their kernel for working around bugs in it instead of reporting the 
bugs. Exactly what we want to not happen.

And unless we want to keep this situation forever, we anyway have to 
remove the support for the PCI-Express adapters from the e1000 driver at 
some point in time, so why not make a clear cut now? Whatever problems 
this causes will be the same now or in a few years.

> 		Linus

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

^ permalink raw reply
