* Re: [PATCH 3/3] deinline a few large functions in vlan code - v3
From: David S. Miller @ 2006-04-11 7:58 UTC (permalink / raw)
To: vda; +Cc: linux-kernel, linux-net, netdev
In-Reply-To: <200604111047.36941.vda@ilport.com.ua>
From: Denis Vlasenko <vda@ilport.com.ua>
Date: Tue, 11 Apr 2006 10:47:36 +0300
> On Tuesday 11 April 2006 10:44, Denis Vlasenko wrote:
> On Tuesday 11 April 2006 10:43, Denis Vlasenko wrote:
> > These patches exclude VLAN code from netdevice drivers
> > and from bonding module, and even remove vlan-related
> > members of struct netdevice if VLAN is not configured.
> >
> > Compile tested on allyesconfig kernel with CONFIG_8021Q=y,m,n.
>
> This one add #ifdefs around vlan_rx_* members of struct netdevice,
> and moves large inlines out-of-line.
This is not very nice, there is no way I'm applying these patches.
I think the current situation is far better than the large pile of
ifdefs these patches are adding to the tree.
Let's just leave things the way they are ok?
^ permalink raw reply
* Re: [PATCH] deinline a few large functions in vlan code v2
From: David S. Miller @ 2006-04-11 8:02 UTC (permalink / raw)
To: vda; +Cc: dave, netdev, linux-kernel, jgarzik
In-Reply-To: <200604111028.54813.vda@ilport.com.ua>
From: Denis Vlasenko <vda@ilport.com.ua>
Date: Tue, 11 Apr 2006 10:28:54 +0300
> But it saves some text (~5k total in all network drivers)
> and removes a branch on rx path on non-VLAN enabled kernels...
It removes "5K total" when you build every single networking driver
statically into the main kernel image.
Nobody does that except for build sanity testing. Folks will
enable the drivers they need for their testing into a static
kernel, or build all the net drivers they want modular.
Please resist the urge to work on things like this in a robotic
fashion. Consider whether the thing you're working on really matters
in real life, and whether the alternative you're providing is really
an overall improvement. In this case you're "improving" a case only
kernel build testers will use, and the amount of ifdef's added by
your patches uglify the code immensely. To me, that's a clear net
loss.
^ permalink raw reply
* Re: [PATCH 3/3] deinline a few large functions in vlan code - v3
From: Denis Vlasenko @ 2006-04-11 8:11 UTC (permalink / raw)
To: David S. Miller; +Cc: linux-kernel, linux-net, netdev
In-Reply-To: <20060411.005858.49205474.davem@davemloft.net>
[-- Attachment #1: Type: text/plain, Size: 403 bytes --]
On Tuesday 11 April 2006 10:58, David S. Miller wrote:
> This is not very nice, there is no way I'm applying these patches.
>
> I think the current situation is far better than the large pile of
> ifdefs these patches are adding to the tree.
>
> Let's just leave things the way they are ok?
:(
Ok, one last try. Would you like this smallish patch instead?
It takes care of those BIG inlines.
--
vda
[-- Attachment #2: 2.6.16.vlan_inline5_core-1.patch --]
[-- Type: text/x-diff, Size: 6811 bytes --]
diff -urpN linux-2.6.16.org/net/core/Makefile linux-2.6.16.vlan/net/core/Makefile
--- linux-2.6.16.org/net/core/Makefile Mon Mar 20 07:53:29 2006
+++ linux-2.6.16.vlan/net/core/Makefile Tue Apr 11 10:15:59 2006
@@ -7,7 +7,7 @@ obj-y := sock.o request_sock.o skbuff.o
obj-$(CONFIG_SYSCTL) += sysctl_net_core.o
-obj-y += dev.o ethtool.o dev_mcast.o dst.o \
+obj-y += dev.o ethtool.o dev_mcast.o dev_vlan.o dst.o \
neighbour.o rtnetlink.o utils.o link_watch.o filter.o
obj-$(CONFIG_XFRM) += flow.o
diff -urpN linux-2.6.16.org/include/linux/if_vlan.h linux-2.6.16.vlan/include/linux/if_vlan.h
--- linux-2.6.16.org/include/linux/if_vlan.h Mon Mar 20 07:53:29 2006
+++ linux-2.6.16.vlan/include/linux/if_vlan.h Tue Apr 11 10:15:59 2006
@@ -149,49 +149,9 @@ struct vlan_skb_tx_cookie {
#define vlan_tx_tag_get(__skb) (VLAN_TX_SKB_CB(__skb)->vlan_tag)
/* VLAN rx hw acceleration helper. This acts like netif_{rx,receive_skb}(). */
-static inline int __vlan_hwaccel_rx(struct sk_buff *skb,
+int __vlan_hwaccel_rx(struct sk_buff *skb,
struct vlan_group *grp,
- unsigned short vlan_tag, int polling)
-{
- struct net_device_stats *stats;
-
- skb->dev = grp->vlan_devices[vlan_tag & VLAN_VID_MASK];
- if (skb->dev == NULL) {
- dev_kfree_skb_any(skb);
-
- /* Not NET_RX_DROP, this is not being dropped
- * due to congestion.
- */
- return 0;
- }
-
- skb->dev->last_rx = jiffies;
-
- stats = vlan_dev_get_stats(skb->dev);
- stats->rx_packets++;
- stats->rx_bytes += skb->len;
-
- skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tag);
- switch (skb->pkt_type) {
- case PACKET_BROADCAST:
- break;
-
- case PACKET_MULTICAST:
- stats->multicast++;
- break;
-
- case PACKET_OTHERHOST:
- /* Our lower layer thinks this is not local, let's make sure.
- * This allows the VLAN to have a different MAC than the underlying
- * device, and still route correctly.
- */
- if (!memcmp(eth_hdr(skb)->h_dest, skb->dev->dev_addr, ETH_ALEN))
- skb->pkt_type = PACKET_HOST;
- break;
- };
-
- return (polling ? netif_receive_skb(skb) : netif_rx(skb));
-}
+ unsigned short vlan_tag, int polling);
static inline int vlan_hwaccel_rx(struct sk_buff *skb,
struct vlan_group *grp,
@@ -218,43 +178,7 @@ static inline int vlan_hwaccel_receive_s
* Following the skb_unshare() example, in case of error, the calling function
* doesn't have to worry about freeing the original skb.
*/
-static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, unsigned short tag)
-{
- struct vlan_ethhdr *veth;
-
- if (skb_headroom(skb) < VLAN_HLEN) {
- struct sk_buff *sk_tmp = skb;
- skb = skb_realloc_headroom(sk_tmp, VLAN_HLEN);
- kfree_skb(sk_tmp);
- if (!skb) {
- printk(KERN_ERR "vlan: failed to realloc headroom\n");
- return NULL;
- }
- } else {
- skb = skb_unshare(skb, GFP_ATOMIC);
- if (!skb) {
- printk(KERN_ERR "vlan: failed to unshare skbuff\n");
- return NULL;
- }
- }
-
- veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN);
-
- /* Move the mac addresses to the beginning of the new header. */
- memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
-
- /* first, the ethernet type */
- veth->h_vlan_proto = __constant_htons(ETH_P_8021Q);
-
- /* now, the tag */
- veth->h_vlan_TCI = htons(tag);
-
- skb->protocol = __constant_htons(ETH_P_8021Q);
- skb->mac.raw -= VLAN_HLEN;
- skb->nh.raw -= VLAN_HLEN;
-
- return skb;
-}
+struct sk_buff *__vlan_put_tag(struct sk_buff *skb, unsigned short tag);
/**
* __vlan_hwaccel_put_tag - hardware accelerated VLAN inserting
diff -urpN linux-2.6.16.org/net/core/dev_vlan.c linux-2.6.16.vlan/net/core/dev_vlan.c
--- linux-2.6.16.org/net/core/dev_vlan.c Thu Jan 1 03:00:00 1970
+++ linux-2.6.16.vlan/net/core/dev_vlan.c Tue Apr 11 10:15:59 2006
@@ -0,0 +1,110 @@
+/* 802.1q helpers.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+/* #if defined(CONFIG_VLAN_8021Q) || defined (CONFIG_VLAN_8021Q_MODULE) */
+
+#include <linux/skbuff.h>
+#include <linux/if_vlan.h>
+
+/* VLAN rx hw acceleration helper. This acts like netif_{rx,receive_skb}(). */
+int __vlan_hwaccel_rx(struct sk_buff *skb,
+ struct vlan_group *grp,
+ unsigned short vlan_tag, int polling)
+{
+ struct net_device_stats *stats;
+
+ skb->dev = grp->vlan_devices[vlan_tag & VLAN_VID_MASK];
+ if (skb->dev == NULL) {
+ dev_kfree_skb_any(skb);
+
+ /* Not NET_RX_DROP, this is not being dropped
+ * due to congestion.
+ */
+ return 0;
+ }
+
+ skb->dev->last_rx = jiffies;
+
+ stats = vlan_dev_get_stats(skb->dev);
+ stats->rx_packets++;
+ stats->rx_bytes += skb->len;
+
+ skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tag);
+ switch (skb->pkt_type) {
+ case PACKET_BROADCAST:
+ break;
+
+ case PACKET_MULTICAST:
+ stats->multicast++;
+ break;
+
+ case PACKET_OTHERHOST:
+ /* Our lower layer thinks this is not local, let's make sure.
+ * This allows the VLAN to have a different MAC than the underlying
+ * device, and still route correctly.
+ */
+ if (!memcmp(eth_hdr(skb)->h_dest, skb->dev->dev_addr, ETH_ALEN))
+ skb->pkt_type = PACKET_HOST;
+ break;
+ };
+
+ return (polling ? netif_receive_skb(skb) : netif_rx(skb));
+}
+EXPORT_SYMBOL(__vlan_hwaccel_rx);
+
+/**
+ * __vlan_put_tag - regular VLAN tag inserting
+ * @skb: skbuff to tag
+ * @tag: VLAN tag to insert
+ *
+ * Inserts the VLAN tag into @skb as part of the payload
+ * Returns a VLAN tagged skb. If a new skb is created, @skb is freed.
+ *
+ * Following the skb_unshare() example, in case of error, the calling function
+ * doesn't have to worry about freeing the original skb.
+ */
+struct sk_buff *__vlan_put_tag(struct sk_buff *skb, unsigned short tag)
+{
+ struct vlan_ethhdr *veth;
+
+ if (skb_headroom(skb) < VLAN_HLEN) {
+ struct sk_buff *sk_tmp = skb;
+ skb = skb_realloc_headroom(sk_tmp, VLAN_HLEN);
+ kfree_skb(sk_tmp);
+ if (!skb) {
+ printk(KERN_ERR "vlan: failed to realloc headroom\n");
+ return NULL;
+ }
+ } else {
+ skb = skb_unshare(skb, GFP_ATOMIC);
+ if (!skb) {
+ printk(KERN_ERR "vlan: failed to unshare skbuff\n");
+ return NULL;
+ }
+ }
+
+ veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN);
+
+ /* Move the mac addresses to the beginning of the new header. */
+ memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
+
+ /* first, the ethernet type */
+ veth->h_vlan_proto = __constant_htons(ETH_P_8021Q);
+
+ /* now, the tag */
+ veth->h_vlan_TCI = htons(tag);
+
+ skb->protocol = __constant_htons(ETH_P_8021Q);
+ skb->mac.raw -= VLAN_HLEN;
+ skb->nh.raw -= VLAN_HLEN;
+
+ return skb;
+}
+EXPORT_SYMBOL(__vlan_put_tag);
+
+/* #endif */
^ permalink raw reply
* Re: [PATCH 3/3] deinline a few large functions in vlan code - v3
From: David S. Miller @ 2006-04-11 8:36 UTC (permalink / raw)
To: vda; +Cc: linux-kernel, linux-net, netdev
In-Reply-To: <200604111111.12554.vda@ilport.com.ua>
From: Denis Vlasenko <vda@ilport.com.ua>
Date: Tue, 11 Apr 2006 11:11:12 +0300
> Ok, one last try. Would you like this smallish patch instead?
> It takes care of those BIG inlines.
You're putting vlan stuff into a net/core/*.c file, that
is not correct.
If we're not going to do the ifdef mess, fudging around it
by putting a "VLAN" function into generic networking core
code isn't the way to try and deal with it.
Like I said, let's just leave things they way they are.
Every suggestion shown so far has been worse than what we
have now.
^ permalink raw reply
* Re: [PATCH] deinline a few large functions in vlan code v2
From: Ingo Oeser @ 2006-04-11 9:49 UTC (permalink / raw)
To: Denis Vlasenko
Cc: Dave Dillow, netdev, David S. Miller, linux-kernel, jgarzik
In-Reply-To: <200604111028.54813.vda@ilport.com.ua>
Hi Denis,
Denis Vlasenko wrote:
> +#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
> if(vlan_tx_tag_present(skb)) {
> first_txd->processFlags |=
> TYPHOON_TX_PF_INSERT_VLAN | TYPHOON_TX_PF_VLAN_PRIORITY;
> @@ -844,6 +849,7 @@ typhoon_start_tx(struct sk_buff *skb, st
> cpu_to_le32(htons(vlan_tx_tag_get(skb)) <<
> TYPHOON_TX_PF_VLAN_TAG_SHIFT);
> }
> +#endif
Wouldn't it be much easier to just do
#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
static inline int vlan_tx_tag_present(...) {
/** get VLAN tag */
}
#else
static inline int vlan_tx_tag_present(...) {return 0;}
#endif
in some header file?
Similiar in typhoon.c:
#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
static inline has_vlan_group(...) {
/* get VLAN group */
}
#else
static inline has_vlan_group(...) {return 0;}
#endif
With this and similiar changes in the drivers,
your patch might be less intrusive and thus more acceptable to maintainers.
Just let the compiler remove the extra code with constant folding and dead
code elemination. The result will be even cleaner code, I think.
What do you think?
Regards
Ingo Oeser
^ permalink raw reply
* Re: [PATCH 3/3] deinline a few large functions in vlan code - v3
From: Denis Vlasenko @ 2006-04-11 12:24 UTC (permalink / raw)
To: David S. Miller; +Cc: linux-kernel, linux-net, netdev
In-Reply-To: <20060411.013642.78129957.davem@davemloft.net>
[-- Attachment #1: Type: text/plain, Size: 845 bytes --]
On Tuesday 11 April 2006 11:36, David S. Miller wrote:
> From: Denis Vlasenko <vda@ilport.com.ua>
> Date: Tue, 11 Apr 2006 11:11:12 +0300
>
> > Ok, one last try. Would you like this smallish patch instead?
> > It takes care of those BIG inlines.
>
> You're putting vlan stuff into a net/core/*.c file, that
> is not correct.
This code is _already_ there, duplicated in every driver
of VLAN-capable NIC.
In other words:
if you have just one VLAN-capable NIC driver compiled in,
you have that code in bzImage. If you have ten drivers,
you have that code duplicated ten times.
On my current kernel, I have several dozens NIC drivers
compiled in, because I don't want to mess with modules
on initrd for network boot...
Regarding net/core/*.c. You are right.
The below patch puts this code into net/8021q/if_vlan.c.
Does it look better?
--
vda
[-- Attachment #2: 2.6.16.vlan_inline6.patch --]
[-- Type: text/x-diff, Size: 7098 bytes --]
diff -urpN linux-2.6.16.org/include/linux/if_vlan.h linux-2.6.16.vlanmini/include/linux/if_vlan.h
--- linux-2.6.16.org/include/linux/if_vlan.h Mon Mar 20 07:53:29 2006
+++ linux-2.6.16.vlanmini/include/linux/if_vlan.h Tue Apr 11 15:08:31 2006
@@ -149,49 +149,9 @@ struct vlan_skb_tx_cookie {
#define vlan_tx_tag_get(__skb) (VLAN_TX_SKB_CB(__skb)->vlan_tag)
/* VLAN rx hw acceleration helper. This acts like netif_{rx,receive_skb}(). */
-static inline int __vlan_hwaccel_rx(struct sk_buff *skb,
+int __vlan_hwaccel_rx(struct sk_buff *skb,
struct vlan_group *grp,
- unsigned short vlan_tag, int polling)
-{
- struct net_device_stats *stats;
-
- skb->dev = grp->vlan_devices[vlan_tag & VLAN_VID_MASK];
- if (skb->dev == NULL) {
- dev_kfree_skb_any(skb);
-
- /* Not NET_RX_DROP, this is not being dropped
- * due to congestion.
- */
- return 0;
- }
-
- skb->dev->last_rx = jiffies;
-
- stats = vlan_dev_get_stats(skb->dev);
- stats->rx_packets++;
- stats->rx_bytes += skb->len;
-
- skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tag);
- switch (skb->pkt_type) {
- case PACKET_BROADCAST:
- break;
-
- case PACKET_MULTICAST:
- stats->multicast++;
- break;
-
- case PACKET_OTHERHOST:
- /* Our lower layer thinks this is not local, let's make sure.
- * This allows the VLAN to have a different MAC than the underlying
- * device, and still route correctly.
- */
- if (!memcmp(eth_hdr(skb)->h_dest, skb->dev->dev_addr, ETH_ALEN))
- skb->pkt_type = PACKET_HOST;
- break;
- };
-
- return (polling ? netif_receive_skb(skb) : netif_rx(skb));
-}
+ unsigned short vlan_tag, int polling);
static inline int vlan_hwaccel_rx(struct sk_buff *skb,
struct vlan_group *grp,
@@ -218,43 +178,7 @@ static inline int vlan_hwaccel_receive_s
* Following the skb_unshare() example, in case of error, the calling function
* doesn't have to worry about freeing the original skb.
*/
-static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, unsigned short tag)
-{
- struct vlan_ethhdr *veth;
-
- if (skb_headroom(skb) < VLAN_HLEN) {
- struct sk_buff *sk_tmp = skb;
- skb = skb_realloc_headroom(sk_tmp, VLAN_HLEN);
- kfree_skb(sk_tmp);
- if (!skb) {
- printk(KERN_ERR "vlan: failed to realloc headroom\n");
- return NULL;
- }
- } else {
- skb = skb_unshare(skb, GFP_ATOMIC);
- if (!skb) {
- printk(KERN_ERR "vlan: failed to unshare skbuff\n");
- return NULL;
- }
- }
-
- veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN);
-
- /* Move the mac addresses to the beginning of the new header. */
- memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
-
- /* first, the ethernet type */
- veth->h_vlan_proto = __constant_htons(ETH_P_8021Q);
-
- /* now, the tag */
- veth->h_vlan_TCI = htons(tag);
-
- skb->protocol = __constant_htons(ETH_P_8021Q);
- skb->mac.raw -= VLAN_HLEN;
- skb->nh.raw -= VLAN_HLEN;
-
- return skb;
-}
+struct sk_buff *__vlan_put_tag(struct sk_buff *skb, unsigned short tag);
/**
* __vlan_hwaccel_put_tag - hardware accelerated VLAN inserting
diff -urpN linux-2.6.16.org/net/8021q/Makefile linux-2.6.16.vlanmini/net/8021q/Makefile
--- linux-2.6.16.org/net/8021q/Makefile Mon Mar 20 07:53:29 2006
+++ linux-2.6.16.vlanmini/net/8021q/Makefile Tue Apr 11 15:08:51 2006
@@ -2,6 +2,8 @@
# Makefile for the Linux VLAN layer.
#
+obj-y += if_vlan.o
+
obj-$(CONFIG_VLAN_8021Q) += 8021q.o
8021q-objs := vlan.o vlan_dev.o
diff -urpN linux-2.6.16.org/net/8021q/if_vlan.c linux-2.6.16.vlanmini/net/8021q/if_vlan.c
--- linux-2.6.16.org/net/8021q/if_vlan.c Thu Jan 1 03:00:00 1970
+++ linux-2.6.16.vlanmini/net/8021q/if_vlan.c Tue Apr 11 15:07:09 2006
@@ -0,0 +1,106 @@
+/* 802.1q helpers.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/skbuff.h>
+#include <linux/if_vlan.h>
+
+/* VLAN rx hw acceleration helper. This acts like netif_{rx,receive_skb}(). */
+int __vlan_hwaccel_rx(struct sk_buff *skb,
+ struct vlan_group *grp,
+ unsigned short vlan_tag, int polling)
+{
+ struct net_device_stats *stats;
+
+ skb->dev = grp->vlan_devices[vlan_tag & VLAN_VID_MASK];
+ if (skb->dev == NULL) {
+ dev_kfree_skb_any(skb);
+
+ /* Not NET_RX_DROP, this is not being dropped
+ * due to congestion.
+ */
+ return 0;
+ }
+
+ skb->dev->last_rx = jiffies;
+
+ stats = vlan_dev_get_stats(skb->dev);
+ stats->rx_packets++;
+ stats->rx_bytes += skb->len;
+
+ skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tag);
+ switch (skb->pkt_type) {
+ case PACKET_BROADCAST:
+ break;
+
+ case PACKET_MULTICAST:
+ stats->multicast++;
+ break;
+
+ case PACKET_OTHERHOST:
+ /* Our lower layer thinks this is not local, let's make sure.
+ * This allows the VLAN to have a different MAC than the underlying
+ * device, and still route correctly.
+ */
+ if (!memcmp(eth_hdr(skb)->h_dest, skb->dev->dev_addr, ETH_ALEN))
+ skb->pkt_type = PACKET_HOST;
+ break;
+ };
+
+ return (polling ? netif_receive_skb(skb) : netif_rx(skb));
+}
+EXPORT_SYMBOL(__vlan_hwaccel_rx);
+
+/**
+ * __vlan_put_tag - regular VLAN tag inserting
+ * @skb: skbuff to tag
+ * @tag: VLAN tag to insert
+ *
+ * Inserts the VLAN tag into @skb as part of the payload
+ * Returns a VLAN tagged skb. If a new skb is created, @skb is freed.
+ *
+ * Following the skb_unshare() example, in case of error, the calling function
+ * doesn't have to worry about freeing the original skb.
+ */
+struct sk_buff *__vlan_put_tag(struct sk_buff *skb, unsigned short tag)
+{
+ struct vlan_ethhdr *veth;
+
+ if (skb_headroom(skb) < VLAN_HLEN) {
+ struct sk_buff *sk_tmp = skb;
+ skb = skb_realloc_headroom(sk_tmp, VLAN_HLEN);
+ kfree_skb(sk_tmp);
+ if (!skb) {
+ printk(KERN_ERR "vlan: failed to realloc headroom\n");
+ return NULL;
+ }
+ } else {
+ skb = skb_unshare(skb, GFP_ATOMIC);
+ if (!skb) {
+ printk(KERN_ERR "vlan: failed to unshare skbuff\n");
+ return NULL;
+ }
+ }
+
+ veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN);
+
+ /* Move the mac addresses to the beginning of the new header. */
+ memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
+
+ /* first, the ethernet type */
+ veth->h_vlan_proto = __constant_htons(ETH_P_8021Q);
+
+ /* now, the tag */
+ veth->h_vlan_TCI = htons(tag);
+
+ skb->protocol = __constant_htons(ETH_P_8021Q);
+ skb->mac.raw -= VLAN_HLEN;
+ skb->nh.raw -= VLAN_HLEN;
+
+ return skb;
+}
+EXPORT_SYMBOL(__vlan_put_tag);
diff -urpN linux-2.6.16.org/net/Makefile linux-2.6.16.vlanmini/net/Makefile
--- linux-2.6.16.org/net/Makefile Mon Mar 20 07:53:29 2006
+++ linux-2.6.16.vlanmini/net/Makefile Tue Apr 11 15:05:11 2006
@@ -41,7 +41,7 @@ obj-$(CONFIG_RXRPC) += rxrpc/
obj-$(CONFIG_ATM) += atm/
obj-$(CONFIG_DECNET) += decnet/
obj-$(CONFIG_ECONET) += econet/
-obj-$(CONFIG_VLAN_8021Q) += 8021q/
+obj-y += 8021q/
obj-$(CONFIG_IP_DCCP) += dccp/
obj-$(CONFIG_IP_SCTP) += sctp/
obj-$(CONFIG_IEEE80211) += ieee80211/
^ permalink raw reply
* Re: [PATCH] deinline a few large functions in vlan code v2
From: Ingo Oeser @ 2006-04-11 13:17 UTC (permalink / raw)
To: Denis Vlasenko
Cc: Dave Dillow, netdev, David S. Miller, linux-kernel, jgarzik,
ioe-lkml
In-Reply-To: <200604111502.52302.vda@ilport.com.ua>
Hi Denis,
Denis Vlasenko wrote:
> On Tuesday 11 April 2006 12:49, Ingo Oeser wrote:
> > #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
> > static inline has_vlan_group(...) {
> > /* get VLAN group */
> > }
> > #else
> > static inline has_vlan_group(...) {return 0;}
> > #endif
> >
> > With this and similiar changes in the drivers,
> > your patch might be less intrusive and thus more acceptable to maintainers.
> >
> > Just let the compiler remove the extra code with constant folding and dead
> > code elemination. The result will be even cleaner code, I think.
> >
> > What do you think?
I thought more of introducing functions, which just fold away
all the "if ()" blocks in normal code paths,
which you wrapped into "#if" here.
I don't think people have problems, if you #ifdef out complete functions,
linear setup code or structure members. People seem to have more problems
with #ifdef in control flow code, because there the condition is nothing else but
a compile time constant (the CONFIG_FOO value) which should be expressed as such.
Instead of
#ifdef CONFIG_FOO
if (condition) {
}
#endif
just do
#ifdef CONFIG_FOO
#define foo 1
#else
#define foo 0
#endif
if (foo && condition) {
}
Just do nothing or return a compile time constant in those inlines.
>
> Addresses of these functions are stored into netdevice members:
>
> @@ -2549,8 +2559,10 @@ typhoon_init_one(struct pci_dev *pdev, c
> dev->watchdog_timeo = TX_TIMEOUT;
> dev->get_stats = typhoon_get_stats;
> dev->set_mac_address = typhoon_set_mac_address;
> +#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
> dev->vlan_rx_register = typhoon_vlan_rx_register;
> dev->vlan_rx_kill_vid = typhoon_vlan_rx_kill_vid;
> +#endif
>
> Even empty inline would not be "optimized out to nothing" here.
No problem, just optimize out the assignment itself:
#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
static inline void typhoon_setup_vlan_hooks(struct netdev *dev) {
dev->vlan_rx_register = typhoon_vlan_rx_register;
dev->vlan_rx_kill_vid = typhoon_vlan_rx_kill_vid;
}
#else
static inline void typhoon_setup_vlan_hooks(struct netdev *dev) { (void)dev; }
#endif
The whole linux/if_vlan.h stuff misses this style of code.
There is simply no fallback code for CONFIG_VLAN_8021Q not being defined.
Regards
Ingo Oeser
^ permalink raw reply
* Re: [PATCH] deinline a few large functions in vlan code v2
From: Dave Dillow @ 2006-04-11 13:59 UTC (permalink / raw)
To: Denis Vlasenko; +Cc: netdev, David S. Miller, linux-kernel, jgarzik
In-Reply-To: <200604111028.54813.vda@ilport.com.ua>
On Tue, 2006-04-11 at 10:28 +0300, Denis Vlasenko wrote:
> > > > > block? For example, typhoon.c:
> > > > >
> > > > > spin_lock(&tp->state_lock);
> > > > > +#if defined(CONFIG_VLAN_8021Q) || defined (CONFIG_VLAN_8021Q_MODULE)
> > > > > if(tp->vlgrp != NULL && rx->rxStatus & TYPHOON_RX_VLAN)
> > > > > vlan_hwaccel_receive_skb(new_skb, tp->vlgrp,
> > > > > ntohl(rx->vlanTag) & 0xffff);
> > > > > else
> > > > > +#endif
> > > > > netif_receive_skb(new_skb);
> > > > > spin_unlock(&tp->state_lock);
> > > > >
> > > > > Same for s2io.c, chelsio/sge.c, etc...
> > > >
> > > > Very likely yes. tp->vlgrp will never be non-NULL in such situations
> > > > so it's not a correctness issue, but rather an optimization :-)
> >
> > I see this as a minor optimization, at the cost uglifying the body of a
> > function.
>
> But it saves some text (~5k total in all network drivers)
> and removes a branch on rx path on non-VLAN enabled kernels...
DaveM beat me to it, but as he said, it saves 5K only if you have all
the drivers built in or loaded. And even if it saves 200 bytes in one
module, unless that module text was already less than 200 bytes into a
page, you've saved no memory -- a 4300 byte module takes 2 pages on x86,
as does a 4100 byte module.
The only savings is in instruction cache, and it would be better to
perhaps uninline vlan_hwaccel_receive_skb() or __vlan_hwaccel_rx() and
make vlan_tx_tag_present be
#if defined(CONFIG_VLAN_8021Q) || defined (CONFIG_VLAN_8021Q_MODULE)
#define vlan_tx_tag_present(__skb) \
(VLAN_TX_SKB_CB(__skb)->magic == VLAN_TX_COOKIE_MAGIC)
#else
#define vlan_tx_tag_present(x) 0
#endif
to get the cache savings on the hot paths without the ugliness.
> > Besides, if you're going to do this, you can get rid of the
> > spin_lock functions around it to, since they only protect tp->vlgrp in
> > this instance.
>
> Done.
>
> > Please don't apply this to typhoon.c
>
> Please comment on the below patch fragment, is it ok with you?
NAK. The #ifdefs are ugly, for no significant gain.
And you introduced a race condition when you moved the spin_locks. The
test for tp->vlgrp is no longer protected.
--
Dave Dillow <dave@thedillows.org>
^ permalink raw reply
* Re: [RFC/PATCH] remove unneeded check in bcm43xx
From: Michael Buesch @ 2006-04-11 16:05 UTC (permalink / raw)
To: David S. Miller
Cc: benoit.boissinot-vYW+cPY1g1pg9hUCZPvPmw,
netdev-u79uwXL29TY76Z2rM5mHXA, bcm43xx-dev-0fE9KPoRgkgATYTw5x5z8w,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linville-2XuSBdqkA4R54TAoqtyWWQ,
benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r
In-Reply-To: <20060410.224933.39567033.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
On Tuesday 11 April 2006 07:49, you wrote:
> From: Benjamin Herrenschmidt <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
> Date: Tue, 11 Apr 2006 11:46:12 +1000
>
> > But ppc64 hits the problem and at this point, there is nothing
> > I can do other than either implementing a split zone allocation mecanism
> > in the ppc64 architecture for the sole sake of bcm43xx (ick !) or doing
> > some trick with the iommu...
>
> I think allowing DMA mask range limiting in the IOMMU layer is going
> to set a very bad precedence, just don't do it.
>
> It's 2006, we should be way past the era of not putting the full 32
> PCI DMA address bits in devices. In this day and age it is simply
> inexscusable.
Sure, but we are in 2006 and actually _have_ at least two of those
devices, which are in usage by many people.
We can't say "vendors must fix this", because lots of devices are
out there.
It must be worked around in the kernel.
--
Greetings Michael.
^ permalink raw reply
* Re: bcm43xx symbol clash problems
From: Michael Buesch @ 2006-04-11 16:35 UTC (permalink / raw)
To: Johannes Berg
Cc: linville-2XuSBdqkA4R54TAoqtyWWQ, netdev-u79uwXL29TY76Z2rM5mHXA,
bcm43xx-dev-0fE9KPoRgkgATYTw5x5z8w
In-Reply-To: <1144737580.4888.25.camel@localhost>
On Tuesday 11 April 2006 08:39, Johannes Berg wrote:
> On Mon, 2006-04-10 at 02:16 +0200, Michael Buesch wrote:
> > Does kbuild perhaps have some magic to handle that?
> >
> > This needs to be solved soon, but I have no idea how.
>
> You want to have bcm43xx and bcm43xx-dscape conflict anyway, since
> there's no point in building both into the kernel, only one can be bound
> to a device. Yes, I know one could still bind them manually, but that's
> about as icky as building them as modules imho. And whoever really needs
> a static kernel will not need both of them.
>
> I don't know how the Kconfig there is laid out, but wouldn't it be
> possible to make them conflict on both yes?
Sure. I would probably say that both m should conflict, too.
And one y and the other m, too.
This can be done with ugly "depends" statements.
--
Greetings Michael.
^ permalink raw reply
* Re: bcm43xx symbol clash problems
From: Johannes Berg @ 2006-04-11 16:42 UTC (permalink / raw)
To: Michael Buesch
Cc: linville-2XuSBdqkA4R54TAoqtyWWQ, netdev-u79uwXL29TY76Z2rM5mHXA,
bcm43xx-dev-0fE9KPoRgkgATYTw5x5z8w
In-Reply-To: <200604111835.03948.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 326 bytes --]
On Tue, 2006-04-11 at 18:35 +0200, Michael Buesch wrote:
> Sure. I would probably say that both m should conflict, too.
Nah, you probably want both m for testing.
> And one y and the other m, too.
Yeah that should conflict too.
> This can be done with ugly "depends" statements.
I have no idea.
johannes
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 793 bytes --]
^ permalink raw reply
* Re: bcm43xx symbol clash problems
From: Michael Buesch @ 2006-04-11 16:59 UTC (permalink / raw)
To: Johannes Berg
Cc: bcm43xx-dev-0fE9KPoRgkgATYTw5x5z8w,
linville-2XuSBdqkA4R54TAoqtyWWQ, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1144773747.4888.37.camel@localhost>
On Tuesday 11 April 2006 18:42, Johannes Berg wrote:
> On Tue, 2006-04-11 at 18:35 +0200, Michael Buesch wrote:
>
> > Sure. I would probably say that both m should conflict, too.
>
> Nah, you probably want both m for testing.
I am OK with this, but it fucks up module autoloading.
But that is probably OK.
--
Greetings Michael.
^ permalink raw reply
* Re: created new q_disc, inserted module, tc tells me unknown qdisc
From: Stephen Hemminger @ 2006-04-11 18:16 UTC (permalink / raw)
To: George P Nychis; +Cc: netdev, lartc
In-Reply-To: <1707.128.2.140.234.1144719699.squirrel@128.2.140.234>
On Mon, 10 Apr 2006 21:41:39 -0400 (EDT)
"George P Nychis" <gnychis@cmu.edu> wrote:
> Hi,
>
> I am trying to install a proprietary qdisc made for research, it is not publically released yet, however its been used several times so i know it works.
>
> The files included are:
> q_xcp.c:
> static int xcp_parse_opt()
> static int xcp_print_opt()
> static int xcp_print_xstats()
> struct qdisc_util xcp_util = { "NULL", "xcp" ..... };
>
> sch_xcp.c:
> static int xcp_enqueue()
> static int xcp_requeue()
> static struct sk_buff * xcp_dequeue()
> ....
> ....
> struct Qdisc_ops xcp_qdisc_ops ={ NULL,NULL,"xcp",.... };
>
> printk(KERN_INFO "XCP qdisc module loaded.\n");
> return register_qdisc(&xcp_qdisc_ops);
>
>
> So, i make everything successfully, it creates q_xcp.so and copies it to /usr/lib and sch_xcp.o which it copies to /lib/modules/... so then I "insmod sch_xcp" and i see in dmesg:
> "XCP qdisc module loaded."
>
> I then try:
> "tc qdisc add dev eth0 root xcp capacity 10Mbit limit 500" and get:
> "Unknown qdisc "xcp", hence option "capacity" is unparsable"
>
> So then I read the INSTALL further to find some sort of solution and it mentions:
> This again assumes "tc" version is 2.4.7. If your "tc" is a different
> version, download the iproute2 source code, and edit Makefile to
> point "TC_INCLUDE" to "-I..../iproute2/include -I..../iproute2/tc"
>
> So, i did that, and i recompiled the q_xcp.so:
> lanthanum-ini src-1.0.1 # make q_xcp.so
> cc -O2 -fPIC -I/var/tmp/portage/iproute2-2.6.11.20050310-r1/work/iproute2-2.6.11/include/ -I/var/tmp/portage/iproute2-2.6.11.20050310-r1/work/iproute2-2.6.11/tc_include -o q_xcp.o -c q_xcp.c
> ld -shared -o q_xcp.so q_xcp.o
> rm -f q_xcp.o
>
> But i still get the same error.... so then my very final last effort was to move q_xcp.c to my iproute2 source code tc/ directory and added this to the makefile:
> TCMODULES += q_xcp.o
>
> Then I compiled tc, and i check tc to see if the xcp qdisc functions were loaded:
> lanthanum-ini tc # nm tc | grep xcp
> 080531ec t xcp_parse_opt
> 080533e0 t xcp_print_opt
> 08053426 t xcp_print_xstats
> 08070cc0 D xcp_util
>
> And finally:
> lanthanum-ini tc # ./tc qdisc add dev ath0 root xcp capacity 54Mbit limit 500
> Unknown qdisc "xcp", hence option "capacity" is unparsable
>
>
> I have no clue :( I figured that putting the .so into /usr/lib would have been enough. Sorry for the long e-mail, I hope someone can help, and thank you for your time even if you don't know the solution but read this :)
>
> - George
>
The .so needs to go in /usr/lib/tc (assuming you are running relatively recent version
of iproute2 tools).
Read source to tc.c where it calls dlopen.
^ permalink raw reply
* Re: [RFC: 2.6 patch] the overdue removal of ip{,6}_queue
From: Harald Welte @ 2006-04-11 18:26 UTC (permalink / raw)
To: Patrick McHardy
Cc: netdev, Netfilter Development Mailinglist, Adrian Bunk, coreteam,
sds
In-Reply-To: <443BC06A.8020103@trash.net>
[-- Attachment #1: Type: text/plain, Size: 1666 bytes --]
On Tue, Apr 11, 2006 at 04:42:50PM +0200, Patrick McHardy wrote:
> > Maybe we should add a printk ('app foo is using obsolete ip_queue
> > system').
>
> Good idea, that will probably help speed it up. But I still think
> we need to give them at least another six month.
ok. I'll prepare a patch for both the printk and the update of
feature-removal-schedule.
> I think we need to do two things:
>
> - make libipq_compat a drop-in replacement that doesn't require
> recompilation
libipq is not distributed as a shared library (at least not by us), so I
don't see any purpose for doing so. Do you think anyone is going to
re-link statically linked code against the new lib
> - make it work with both ipq and nfnetlink_queue
that should be possible, though.
I still don't really think that it is worth all the effort, especially
since there is only one hand full of applications using libipq.
The backwards compatibility library is expected to perform a lot worse
thna both the old libipq as well as the new native libnetfilter_queue
due to additional data copies, etc. When I wrote it, it was more
intended as some intermediate aid, something that helps while people
migrate. We shouldn't make it too perfect, otherwise they won't migrate
at all.
--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply
* Re: 2.6.17 regression: Very slow net transfer from some hosts
From: Stephen Hemminger @ 2006-04-11 19:21 UTC (permalink / raw)
To: Daniel Drake; +Cc: jheffner, netdev, linux-kernel
In-Reply-To: <443C03E6.7080202@gentoo.org>
On Tue, 11 Apr 2006 20:30:46 +0100
Daniel Drake <dsd@gentoo.org> wrote:
> Hi,
>
> Since sometime after 2.6.16, some websites have been very slow to load.
> Examples include:
>
> http://zd1211.ath.cx
> http://developer.osdl.org/shemminger/blog/
> http://www.reactivated.net/weblog
>
> On a good kernel, "wget http://zd1211.ath.cx" says:
> 20:23:38 (90.44 KB/s) - `index.html' saved [20895/20895]
>
> On a bad kernel:
> 20:14:18 (327.01 B/s) - `index.html' saved [20895/20895]
>
> I reproduced this on two different internet connections (same ISP
> though). However I cannot reproduce it on my other system.
>
> git-bisect tracked it down to:
>
> 7b4f4b5ebceab67ce440a61081a69f0265e17c2a is first bad commit
> diff-tree 7b4f4b5ebceab67ce440a61081a69f0265e17c2a (from
> 2babf9daae4a3561f3264638a22ac7d0b14a6f52)
> Author: John Heffner <jheffner@psc.edu>
> Date: Sat Mar 25 01:34:07 2006 -0800
>
> [TCP]: Set default max buffers from memory pool size
>
> Indeed, reverting this patch from 2.6.17-rc1-git4 allows those sites to
> load again.
>
> Any ideas?
Get a tcpdump. There are tools to sanitize the file if you worry about
ip addresses, etc.
^ permalink raw reply
* Re: 2.6.17 regression: Very slow net transfer from some hosts
From: John Heffner @ 2006-04-11 19:23 UTC (permalink / raw)
To: Daniel Drake; +Cc: netdev, linux-kernel
In-Reply-To: <443C03E6.7080202@gentoo.org>
Daniel Drake wrote:
> Hi,
>
> Since sometime after 2.6.16, some websites have been very slow to load.
> Examples include:
>
> http://zd1211.ath.cx
> http://developer.osdl.org/shemminger/blog/
> http://www.reactivated.net/weblog
>
> On a good kernel, "wget http://zd1211.ath.cx" says:
> 20:23:38 (90.44 KB/s) - `index.html' saved [20895/20895]
>
> On a bad kernel:
> 20:14:18 (327.01 B/s) - `index.html' saved [20895/20895]
>
> I reproduced this on two different internet connections (same ISP
> though). However I cannot reproduce it on my other system.
>
> git-bisect tracked it down to:
>
> 7b4f4b5ebceab67ce440a61081a69f0265e17c2a is first bad commit
> diff-tree 7b4f4b5ebceab67ce440a61081a69f0265e17c2a (from
> 2babf9daae4a3561f3264638a22ac7d0b14a6f52)
> Author: John Heffner <jheffner@psc.edu>
> Date: Sat Mar 25 01:34:07 2006 -0800
>
> [TCP]: Set default max buffers from memory pool size
>
> Indeed, reverting this patch from 2.6.17-rc1-git4 allows those sites to
> load again.
>
> Any ideas?
I'm not seeing this behavior myself. What are the values of
/proc/sys/net/ipv4/tcp_wmem, tcp_rmem, and tcp_mem? How much memory
does this system have? (A binary tcpdump might be good, too.)
Thanks,
-John
^ permalink raw reply
* Re: created new q_disc, inserted module, tc tells me unknown qdisc
From: George P Nychis @ 2006-04-11 19:26 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, lartc
In-Reply-To: <20060411111644.59573edc@localhost.localdomain>
> On Mon, 10 Apr 2006 21:41:39 -0400 (EDT) "George P Nychis"
> <gnychis@cmu.edu> wrote:
>
>> Hi,
>>
>> I am trying to install a proprietary qdisc made for research, it is not
>> publically released yet, however its been used several times so i know
>> it works.
>>
>> The files included are: q_xcp.c: static int xcp_parse_opt() static int
>> xcp_print_opt() static int xcp_print_xstats() struct qdisc_util xcp_util
>> = { "NULL", "xcp" ..... };
>>
>> sch_xcp.c: static int xcp_enqueue() static int xcp_requeue() static struct
>> sk_buff * xcp_dequeue() .... .... struct Qdisc_ops xcp_qdisc_ops ={
>> NULL,NULL,"xcp",.... };
>>
>> printk(KERN_INFO "XCP qdisc module loaded.\n"); return
>> register_qdisc(&xcp_qdisc_ops);
>>
>>
>> So, i make everything successfully, it creates q_xcp.so and copies it
>> to /usr/lib and sch_xcp.o which it copies to /lib/modules/... so then I
>> "insmod sch_xcp" and i see in dmesg: "XCP qdisc module loaded."
>>
>> I then try: "tc qdisc add dev eth0 root xcp capacity 10Mbit limit 500"
>> and get: "Unknown qdisc "xcp", hence option "capacity" is unparsable"
>>
>> So then I read the INSTALL further to find some sort of solution and it
>> mentions: This again assumes "tc" version is 2.4.7. If your "tc" is a
>> different version, download the iproute2 source code, and edit Makefile
>> to point "TC_INCLUDE" to "-I..../iproute2/include -I..../iproute2/tc"
>>
>> So, i did that, and i recompiled the q_xcp.so: lanthanum-ini src-1.0.1 #
>> make q_xcp.so cc -O2 -fPIC
>> -I/var/tmp/portage/iproute2-2.6.11.20050310-r1/work/iproute2-2.6.11/inc
>> lude/
>> -I/var/tmp/portage/iproute2-2.6.11.20050310-r1/work/iproute2-2.6.11/tc_
>> include -o q_xcp.o -c q_xcp.c ld -shared -o q_xcp.so q_xcp.o rm -f
>> q_xcp.o
>>
>> But i still get the same error.... so then my very final last effort
>> was to move q_xcp.c to my iproute2 source code tc/ directory and added
>> this to the makefile: TCMODULES += q_xcp.o
>>
>> Then I compiled tc, and i check tc to see if the xcp qdisc functions
>> were loaded: lanthanum-ini tc # nm tc | grep xcp 080531ec t xcp_parse_opt
>> 080533e0 t xcp_print_opt 08053426 t xcp_print_xstats 08070cc0 D xcp_util
>>
>>
>> And finally: lanthanum-ini tc # ./tc qdisc add dev ath0 root xcp
>> capacity 54Mbit limit 500 Unknown qdisc "xcp", hence option "capacity"
>> is unparsable
>>
>>
>> I have no clue :( I figured that putting the .so into /usr/lib would
>> have been enough. Sorry for the long e-mail, I hope someone can help,
>> and thank you for your time even if you don't know the solution but
>> read this :)
>>
>> - George
>>
>
> The .so needs to go in /usr/lib/tc (assuming you are running relatively
> recent version of iproute2 tools).
>
> Read source to tc.c where it calls dlopen.
>
>
Still didn't seem to solve the problem :\
In my tc.c i have:
snprintf(buf, sizeof(buf), "/usr/lib/tc/q_%s.so", str);
Also:
lanthanum-ini tc # ls /usr/lib/tc
experimental.dist normal.dist pareto.dist paretonormal.dist q_netem.so q_xcp.so
And finally:
lanthanum-ini tc # tc qdisc add dev ath0 root xcp capacity 54Mbit limit 500
Unknown qdisc "xcp", hence option "capacity" is unparsable
Maybe i should add debugging in tc.c and see if it sees the .so and fails to load it or something.
Any other suggestions?
Thanks for all the responses,
George
^ permalink raw reply
* 2.6.17 regression: Very slow net transfer from some hosts
From: Daniel Drake @ 2006-04-11 19:30 UTC (permalink / raw)
To: jheffner; +Cc: netdev, linux-kernel
Hi,
Since sometime after 2.6.16, some websites have been very slow to load.
Examples include:
http://zd1211.ath.cx
http://developer.osdl.org/shemminger/blog/
http://www.reactivated.net/weblog
On a good kernel, "wget http://zd1211.ath.cx" says:
20:23:38 (90.44 KB/s) - `index.html' saved [20895/20895]
On a bad kernel:
20:14:18 (327.01 B/s) - `index.html' saved [20895/20895]
I reproduced this on two different internet connections (same ISP
though). However I cannot reproduce it on my other system.
git-bisect tracked it down to:
7b4f4b5ebceab67ce440a61081a69f0265e17c2a is first bad commit
diff-tree 7b4f4b5ebceab67ce440a61081a69f0265e17c2a (from
2babf9daae4a3561f3264638a22ac7d0b14a6f52)
Author: John Heffner <jheffner@psc.edu>
Date: Sat Mar 25 01:34:07 2006 -0800
[TCP]: Set default max buffers from memory pool size
Indeed, reverting this patch from 2.6.17-rc1-git4 allows those sites to
load again.
Any ideas?
Thanks,
Daniel
^ permalink raw reply
* Re: Re: created new q_disc, inserted module, tc tells me unknown qdisc
From: George P Nychis @ 2006-04-11 19:53 UTC (permalink / raw)
To: George P Nychis; +Cc: netdev, lartc
In-Reply-To: <3568.128.2.140.234.1144783610.squirrel@128.2.140.234>
I am getting closer...
I added debugging, and noticed that it looks for:
snprintf(buf, sizeof(buf), "%s_qdisc_util", str);
However in q_xcp.c it had:
struct qdisc_util xcp_util = {
so I changed that to xcp_qdisc_util, and now i run tc:
lanthanum-ini tc # tc qdisc add dev ath0 root xcp capacity 54Mbit limit 100
Segmentation fault
This happens on this line:
q = dlsym(dlh, buf);
Since this is very hard for people to help me without the source code, i did ask the author if it has been release publically and am waiting for a response. In the meantime, it seems as though maybe instead of trying to get this to work with a newer version of tc, i should install an old version of tc that the module was original made for.
Though if anyone else has ideas let me know.
Thanks for all the help
- George
>> On Mon, 10 Apr 2006 21:41:39 -0400 (EDT) "George P Nychis"
>> <gnychis@cmu.edu> wrote:
>>
>>> Hi,
>>>
>>> I am trying to install a proprietary qdisc made for research, it is
>>> not publically released yet, however its been used several times so i
>>> know it works.
>>>
>>> The files included are: q_xcp.c: static int xcp_parse_opt() static
>>> int xcp_print_opt() static int xcp_print_xstats() struct qdisc_util
>>> xcp_util = { "NULL", "xcp" ..... };
>>>
>>> sch_xcp.c: static int xcp_enqueue() static int xcp_requeue() static
>>> struct sk_buff * xcp_dequeue() .... .... struct Qdisc_ops
>>> xcp_qdisc_ops ={ NULL,NULL,"xcp",.... };
>>>
>>> printk(KERN_INFO "XCP qdisc module loaded.\n"); return
>>> register_qdisc(&xcp_qdisc_ops);
>>>
>>>
>>> So, i make everything successfully, it creates q_xcp.so and copies it
>>> to /usr/lib and sch_xcp.o which it copies to /lib/modules/... so
>>> then I "insmod sch_xcp" and i see in dmesg: "XCP qdisc module loaded."
>>>
>>>
>>> I then try: "tc qdisc add dev eth0 root xcp capacity 10Mbit limit
>>> 500" and get: "Unknown qdisc "xcp", hence option "capacity" is
>>> unparsable"
>>>
>>> So then I read the INSTALL further to find some sort of solution and
>>> it mentions: This again assumes "tc" version is 2.4.7. If your "tc"
>>> is a different version, download the iproute2 source code, and edit
>>> Makefile to point "TC_INCLUDE" to "-I..../iproute2/include
>>> -I..../iproute2/tc"
>>>
>>> So, i did that, and i recompiled the q_xcp.so: lanthanum-ini
>>> src-1.0.1 # make q_xcp.so cc -O2 -fPIC
>>> -I/var/tmp/portage/iproute2-2.6.11.20050310-r1/work/iproute2-2.6.11/i
>>> nc lude/
>>> -I/var/tmp/portage/iproute2-2.6.11.20050310-r1/work/iproute2-2.6.11/t
>>> c_ include -o q_xcp.o -c q_xcp.c ld -shared -o q_xcp.so q_xcp.o rm -f
>>> q_xcp.o
>>>
>>> But i still get the same error.... so then my very final last effort
>>> was to move q_xcp.c to my iproute2 source code tc/ directory and
>>> added this to the makefile: TCMODULES += q_xcp.o
>>>
>>> Then I compiled tc, and i check tc to see if the xcp qdisc functions
>>> were loaded: lanthanum-ini tc # nm tc | grep xcp 080531ec t
>>> xcp_parse_opt 080533e0 t xcp_print_opt 08053426 t xcp_print_xstats
>>> 08070cc0 D xcp_util
>>>
>>>
>>> And finally: lanthanum-ini tc # ./tc qdisc add dev ath0 root xcp
>>> capacity 54Mbit limit 500 Unknown qdisc "xcp", hence option
>>> "capacity" is unparsable
>>>
>>>
>>> I have no clue :( I figured that putting the .so into /usr/lib would
>>> have been enough. Sorry for the long e-mail, I hope someone can
>>> help, and thank you for your time even if you don't know the solution
>>> but read this :)
>>>
>>> - George
>>>
>>
>> The .so needs to go in /usr/lib/tc (assuming you are running relatively
>> recent version of iproute2 tools).
>>
>> Read source to tc.c where it calls dlopen.
>>
>>
>
> Still didn't seem to solve the problem :\
>
> In my tc.c i have: snprintf(buf, sizeof(buf), "/usr/lib/tc/q_%s.so", str);
>
>
> Also: lanthanum-ini tc # ls /usr/lib/tc experimental.dist normal.dist
> pareto.dist paretonormal.dist q_netem.so q_xcp.so
>
> And finally: lanthanum-ini tc # tc qdisc add dev ath0 root xcp capacity
> 54Mbit limit 500 Unknown qdisc "xcp", hence option "capacity" is
> unparsable
>
> Maybe i should add debugging in tc.c and see if it sees the .so and fails
> to load it or something.
>
> Any other suggestions?
>
> Thanks for all the responses, George
>
> _______________________________________________ LARTC mailing list
> LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
>
>
--
^ permalink raw reply
* Re: 2.6.17 regression: Very slow net transfer from some hosts
From: John Heffner @ 2006-04-11 19:55 UTC (permalink / raw)
To: Daniel Drake; +Cc: netdev, linux-kernel
In-Reply-To: <443C0B74.50305@gentoo.org>
Daniel Drake wrote:
> John Heffner wrote:
>> I'm not seeing this behavior myself. What are the values of
>> /proc/sys/net/ipv4/tcp_wmem, tcp_rmem, and tcp_mem? How much memory
>> does this system have? (A binary tcpdump might be good, too.)
>
> tcp_wmem: 4096 16384 131072
> tcp_rmem: 4096 87380 174760
> tcp_mem: 98304 131072 196608
These are (I assume) with the patch reversed. What are the values with
the patch applied?
Thanks,
-John
^ permalink raw reply
* Re: 2.6.17 regression: Very slow net transfer from some hosts
From: Daniel Drake @ 2006-04-11 20:03 UTC (permalink / raw)
To: John Heffner; +Cc: netdev, linux-kernel
In-Reply-To: <443C024C.2070107@psc.edu>
John Heffner wrote:
> I'm not seeing this behavior myself. What are the values of
> /proc/sys/net/ipv4/tcp_wmem, tcp_rmem, and tcp_mem? How much memory
> does this system have? (A binary tcpdump might be good, too.)
tcp_wmem: 4096 16384 131072
tcp_rmem: 4096 87380 174760
tcp_mem: 98304 131072 196608
tcpdumps coming later. This is on an x86_64 system with 1GB RAM. I
connect via the forcedeth driver but also reproduced this through ipw2200.
Thanks!
Daniel
^ permalink raw reply
* Re: Re: created new q_disc, inserted module, tc tells me unknown qdisc
From: Tim Shepard @ 2006-04-11 20:29 UTC (permalink / raw)
To: George P Nychis; +Cc: netdev, lartc
In-Reply-To: <3568.128.2.140.234.1144783610.squirrel@128.2.140.234>
> Maybe i should add debugging in tc.c and see if it sees the .so and fails to load it or something.
Yes. I would do that next.
Compile tc with -g and then run it under gdb and step through that
part to see what happens.
-Tim Shepard
shep@alum.mit.edu
^ permalink raw reply
* Re: [RFC/PATCH] remove unneeded check in bcm43xx
From: Benjamin Herrenschmidt @ 2006-04-11 20:49 UTC (permalink / raw)
To: David S. Miller
Cc: benoit.boissinot, mb, netdev, bcm43xx-dev, linux-kernel, linville
In-Reply-To: <20060410.224933.39567033.davem@davemloft.net>
> I think allowing DMA mask range limiting in the IOMMU layer is going
> to set a very bad precedence, just don't do it.
>
> It's 2006, we should be way past the era of not putting the full 32
> PCI DMA address bits in devices. In this day and age it is simply
> inexscusable.
>
> Maybe we could understand chips coming out 8 years ago when a lot of
> designs were transitioning from ISA to PCI, but that no longer applies
> in any way today.
I would tend to agree... except that the broadcom is _the_ wireless card
shipped by Apple with all of their machines for the last few years, and
thus, the problem will be hit by pretty much any G5 user trying to use
theirs...
I don't have another idea on how to fix that at hand... a dma mask limit
in the iommu layer is fairly easy to implement with our iommu
implementation (though it wouldn't work on pseries where ranges are
allocated per slot, but it would work fine on a g5). Still sounds better
than introducing a ZONE_DMA separate from ZONE_NORMAL ...
Ben.
^ permalink raw reply
* Re: 2.6.17 regression: Very slow net transfer from some hosts
From: Daniel Drake @ 2006-04-11 20:53 UTC (permalink / raw)
To: John Heffner; +Cc: netdev, linux-kernel
In-Reply-To: <443C09A7.2040900@psc.edu>
John Heffner wrote:
>> tcp_wmem: 4096 16384 131072
>> tcp_rmem: 4096 87380 174760
>> tcp_mem: 98304 131072 196608
>
> These are (I assume) with the patch reversed. What are the values with
> the patch applied?
Yes- that was on a good kernel, with the patch reversed.
On a bad kernel, with the patch applied (2.6.16-git16):
tcp_wmem: 4096 16384 4194304
tcp_rmem: 4096 87380 4194304
tcp_mem: 98304 131072 196608
They seem to be identical, which makes sense, since most websites work
just fine.
I am sending tcpdump's privately to you and Stephen. If anyone else
wants to see them, just ask.
Daniel
^ permalink raw reply
* Re: 2.6.17 regression: Very slow net transfer from some hosts
From: John Heffner @ 2006-04-11 20:54 UTC (permalink / raw)
To: Daniel Drake; +Cc: netdev, linux-kernel
In-Reply-To: <443C1738.20605@gentoo.org>
This is almost certainly due to a buggy firewall that doesn't understand
TCP window scaling. I've usually seen this in the past with OpenBSD
firewalls. Do you have one of these in your path?
Thanks,
-John
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox