Netdev List

* Re: [PATCH] IPv6: Cleanup of net/ipv6/reassambly.c
From: Ingo Oeser @ 2006-03-12  7:17 UTC
  To: YOSHIFUJI Hideaki / 吉藤英明
  Cc: David S. Miller, netdev, linux-kernel
In-Reply-To: <200603120049.49294.ioe-lkml@rameria.de>

Hi,

On Sunday, 12. March 2006 00:49, Ingo Oeser wrote:
> From: Ingo Oeser <ioe-lkml@rameria.de>
> 
> Two minor cleanups:
> 
> 1. Using kzalloc() in frag_alloc_queue()
>    saves the memset() in ipv6_frag_create().
> 
> 2. Invert sense of if-statements to streamline code.
>    Inverts the comment, too.
> 

These are against net-2.6.17 of course.

I also compile tested this and my other kzalloc() changes.

Forgot to mention this yesterday...

Regards

Ingo Oeser

* Re: Linux v2.6.16-rc6
From: Willy Tarreau @ 2006-03-12  8:35 UTC
  To: David S. Miller
  Cc: michal.k.k.piotrowski, torvalds, linux-kernel, netdev, herbert
In-Reply-To: <20060311.183904.71244086.davem@davemloft.net>

On Sat, Mar 11, 2006 at 06:39:04PM -0800, David S. Miller wrote:
> From: "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com>
> Date: Sun, 12 Mar 2006 02:51:40 +0100
> 
> > I have noticed these warnings
> > TCP: Treason uncloaked! Peer 82.113.55.2:11759/50967 shrinks window
> > 148470938:148470943. Repaired.
> > TCP: Treason uncloaked! Peer 82.113.55.2:11759/50967 shrinks window
> > 148470938:148470943. Repaired.
> > TCP: Treason uncloaked! Peer 82.113.55.2:11759/59768 shrinks window
> > 1124211698:1124211703. Repaired.
> > TCP: Treason uncloaked! Peer 82.113.55.2:11759/59768 shrinks window
> > 1124211698:1124211703. Repaired.
> > 
> > It may be a problem with ktorrent.
> 
> It is a problem with the remote TCP implementation: it is
> illegally advertising a smaller window than it previously
> did.

On 2005/10/27, Herbert Xu provided a patch, merged in 2.6.14, to fix some
erroneous occurrences of this message (some of them appeared with Linux
on the other side). It would be interesting to know whether the peer
above is Linux or not, because Herbert's fix might need to be applied
in other places.

Here comes his patch with his interesting analysis for reference, in
case it might give ideas to anybody.

Cheers,
Willy

---
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 27 Oct 2005 08:47:46 +0000 (+1000)
Subject: [TCP]: Clear stale pred_flags when snd_wnd changes
X-Git-Tag: v2.6.14
X-Git-Url: http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2ad41065d9fe518759b695fc2640cf9c07261dd2

[TCP]: Clear stale pred_flags when snd_wnd changes

This bug is responsible for causing the infamous "Treason uncloaked"
messages that have been popping up everywhere since the printk was added.
It has usually been blamed on foreign operating systems.  However,
some of those reports implicate Linux as both systems are running
Linux or the TCP connection is going across the loopback interface.

In fact, there really is a bug in the Linux TCP header prediction code
that's been there since at least 2.1.8.  This bug was tracked down with
help from Dale Blount.

The effect of this bug ranges from harmless "Treason uncloaked"
messages to hung/aborted TCP connections.  The details of the bug
and fix are as follows.

When snd_wnd is updated, we only update pred_flags if
tcp_fast_path_check succeeds.  When it fails (for example,
when our rcvbuf is used up), we will leave pred_flags with
an out-of-date snd_wnd value.

When the out-of-date pred_flags happens to match the next incoming
packet we will again hit the fast path and use the current snd_wnd
which will be wrong.

In the case of the treason messages, it just happens that the snd_wnd
cached in pred_flags is zero while tp->snd_wnd is non-zero.  Therefore
when a zero-window packet comes in we incorrectly conclude that the
window is non-zero.

In fact if the peer continues to send us zero-window pure ACKs we
will continue making the same mistake.  It's only when the peer
transmits a zero-window packet with data attached that we get a
chance to snap out of it.  This is what triggers the treason
message at the next retransmit timeout.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
---

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2239,6 +2239,7 @@ static int tcp_ack_update_window(struct 
 			/* Note, it is the only place, where
 			 * fast path is recovered for sending TCP.
 			 */
+			tp->pred_flags = 0;
 			tcp_fast_path_check(sk, tp);

 			if (nwin > tp->max_window) {

----
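
The failure mode described in the commit message can be modelled outside
the kernel. The following is a minimal user-space sketch, not kernel code:
the names toy_tp, toy_fast_path_check(), toy_update_window() and
toy_fast_path_hit() are hypothetical, and the real pred_flags word encodes
more than the window. It only shows how a cached window value goes stale
when the fast-path check fails, and how clearing it first avoids a false
fast-path hit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: field names echo the kernel's, the encoding does not. */
#define TOY_PRED_VALID 0x80000000u

struct toy_tp {
	uint32_t snd_wnd;     /* send window last advertised by the peer */
	uint32_t pred_flags;  /* 0 = fast path off, else VALID | cached window */
	bool rcvbuf_full;     /* stands in for tcp_fast_path_check() failing */
};

/* Re-arm the fast path only when the (toy) check succeeds. */
void toy_fast_path_check(struct toy_tp *tp)
{
	assert(tp != NULL);
	if (!tp->rcvbuf_full)
		tp->pred_flags = TOY_PRED_VALID | (tp->snd_wnd & 0xffffu);
}

/* Window update; clear_first selects the patched behaviour. */
void toy_update_window(struct toy_tp *tp, uint32_t nwin, bool clear_first)
{
	assert(tp != NULL);
	tp->snd_wnd = nwin;
	if (clear_first)
		tp->pred_flags = 0;	/* the one-line fix from the patch */
	toy_fast_path_check(tp);
}

/* A segment takes the fast path when its window matches the cached one. */
bool toy_fast_path_hit(const struct toy_tp *tp, uint32_t seg_wnd)
{
	assert(tp != NULL);
	return tp->pred_flags == (TOY_PRED_VALID | (seg_wnd & 0xffffu));
}
```

With clear_first set to false, a zero-window segment still matches the
stale cache although snd_wnd has already moved to the new value, which
corresponds to the situation the commit message describes. With clear_first
set to true the same segment falls through to the slow path.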

* Re: [PATCH 2/8] [I/OAT] Driver for the Intel(R) I/OAT DMA engine
From: Evgeniy Polyakov @ 2006-03-12 10:47 UTC
  To: Leech, Christopher; +Cc: linux-kernel, netdev
In-Reply-To: <E3A930D59AFC3345AEBA35189102A8A6060E15E7@orsmsx404.amr.corp.intel.com>

On Fri, Mar 10, 2006 at 06:29:46PM -0800, Leech, Christopher (christopher.leech@intel.com) wrote:
> From: Chris Leech [mailto:christopher.leech@intel.com] 
> Sent: Friday, March 10, 2006 6:29 PM
> To: 
> Subject: [PATCH 2/8] [I/OAT] Driver for the Intel(R) I/OAT DMA engine
> 
> 
> Adds a new ioatdma driver

enumerate_dma_channels() is still broken, if it can not fail add NOFAIL
gfp flag.
And you play tricky games with common_node/device_node of struct
dma_chan - one of those lists is never protected, while the other is
accessed under RCU and other locks (btw, why does insertion use RCU while
deletion in dma_async_device_unregister() does not?).
struct ioat_dma_chan - is it freed anywhere?

-- 
	Evgeniy Polyakov

* Re: [PATCH 2/8] [I/OAT] Driver for the Intel(R) I/OAT DMA engine
From: Andrew Morton @ 2006-03-12 11:04 UTC
  To: Evgeniy Polyakov; +Cc: christopher.leech, linux-kernel, netdev
In-Reply-To: <20060312104728.GA25301@2ka.mipt.ru>

Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
>
>  On Fri, Mar 10, 2006 at 06:29:46PM -0800, Leech, Christopher (christopher.leech@intel.com) wrote:
>  > From: Chris Leech [mailto:christopher.leech@intel.com] 
>  > Sent: Friday, March 10, 2006 6:29 PM
>  > To: 
>  > Subject: [PATCH 2/8] [I/OAT] Driver for the Intel(R) I/OAT DMA engine
>  > 
>  > 
>  > Adds a new ioatdma driver
> 
>  enumerate_dma_channels() is still broken, if it can not fail add NOFAIL
>  gfp flag.

The __GFP_NOFAIL flag is there to mark lame-and-buggy-code which doesn't
know how to handle ENOMEM.  I went through the kernel, found all the
retry-until-it-works loops and consolidated their behaviour in the page
allocator instead.

Really we should fix them all up.  Adding new users of __GFP_NOFAIL
would not be good.
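
As an illustration of handling ENOMEM instead of relying on __GFP_NOFAIL,
here is a small user-space sketch of a channel enumeration loop that
unwinds and returns -ENOMEM when an allocation fails. It is not the ioatdma
code: toy_chan, toy_enumerate_channels(), toy_free_channels() and the
budget allocator are hypothetical names, and malloc() stands in for the
kernel allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_chan {
	int id;
	struct toy_chan *next;
};

typedef void *(*toy_alloc_fn)(size_t size);

/* Failure injection helper: succeeds toy_alloc_budget times, then fails. */
int toy_alloc_budget = 0;

void *toy_budget_alloc(size_t size)
{
	if (toy_alloc_budget <= 0)
		return NULL;
	toy_alloc_budget--;
	return malloc(size);
}

/* Free a whole channel list. */
void toy_free_channels(struct toy_chan *list)
{
	while (list) {
		struct toy_chan *next = list->next;
		free(list);
		list = next;
	}
}

/*
 * Build a list of count channels. On allocation failure everything
 * allocated so far is released and -ENOMEM is returned, so the caller
 * never sees a half-built list and nothing loops until memory appears.
 */
int toy_enumerate_channels(struct toy_chan **head, int count,
			   toy_alloc_fn alloc)
{
	struct toy_chan *list = NULL;
	int i;

	assert(head != NULL && alloc != NULL);
	for (i = 0; i < count; i++) {
		struct toy_chan *chan = alloc(sizeof(*chan));

		if (!chan) {
			toy_free_channels(list);
			*head = NULL;
			return -ENOMEM;
		}
		chan->id = i;
		chan->next = list;
		list = chan;
	}
	*head = list;
	return 0;
}
```

The caller is then expected to propagate the error from its probe path
rather than assume enumeration cannot fail.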

* Re: Linux v2.6.16-rc6
From: Michal Piotrowski @ 2006-03-12 12:04 UTC
  To: David S. Miller; +Cc: torvalds, linux-kernel, netdev
In-Reply-To: <20060311.183904.71244086.davem@davemloft.net>

Hi,

On 12/03/06, David S. Miller <davem@davemloft.net> wrote:
> From: "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com>
> Date: Sun, 12 Mar 2006 02:51:40 +0100
>
> > I have noticed these warnings
> > TCP: Treason uncloaked! Peer 82.113.55.2:11759/50967 shrinks window
> > 148470938:148470943. Repaired.
> > TCP: Treason uncloaked! Peer 82.113.55.2:11759/50967 shrinks window
> > 148470938:148470943. Repaired.
> > TCP: Treason uncloaked! Peer 82.113.55.2:11759/59768 shrinks window
> > 1124211698:1124211703. Repaired.
> > TCP: Treason uncloaked! Peer 82.113.55.2:11759/59768 shrinks window
> > 1124211698:1124211703. Repaired.
> >
> > It may be a problem with ktorrent.
>
> It is a problem with the remote TCP implementation: it is
> illegally advertising a smaller window than it previously
> did.
>

Thanks for the explanation.

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

* Re: [PATCH] Nearly complete kzalloc cleanup for net/ipv6
From: Patrick McHardy @ 2006-03-12 13:57 UTC
  To: Ingo Oeser
  Cc: YOSHIFUJI Hideaki / 吉藤英明,
	David Miller, linux-kernel, netdev
In-Reply-To: <200603112136.43553.ioe-lkml@rameria.de>

Ingo Oeser wrote:
> From: Ingo Oeser <ioe-lkml@rameria.de>
> 
> Stupidly use kzalloc() instead of kmalloc()/memset() 
> everywhere where this is possible in net/ipv6/*.c . 
> 
> The netfilter part is NOT included, because Harald should see these, too.

Feel free to send netfilter patches to me.

* Re: 2.6.16-rc6-mm1
From: Benoit Boissinot @ 2006-03-12 15:55 UTC
  To: Andrew Morton; +Cc: linux-kernel, jgarzik, netdev
In-Reply-To: <20060312031036.3a382581.akpm@osdl.org>

On Sun, Mar 12, 2006 at 03:10:36AM -0800, Andrew Morton wrote:
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16-rc6/2.6.16-rc6-mm1/
> 
> 

drivers/net/tg3.c:8065: warning: type qualifiers ignored on function return type

Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.fr>

Index: linux/drivers/net/tg3.c
===================================================================
--- linux.orig/drivers/net/tg3.c
+++ linux/drivers/net/tg3.c
@@ -8061,7 +8061,7 @@ static int tg3_test_link(struct tg3 *tp)
 }
 
 /* Only test the commonly used registers */
-static const int tg3_test_registers(struct tg3 *tp)
+static int tg3_test_registers(struct tg3 *tp)
 {
 	int i, is_5705;
 	u32 offset, read_mask, write_mask, val, save_val, read_val;

-- 
powered by bash/screen/(urxvt/fvwm|linux-console)/gentoo/gnu/linux OS

* [patch 0/4] natsemi: Aculab E1/T1 PMXc Carrier Card support
From: Mark Brown @ 2006-03-12 19:22 UTC
  To: Tim Hockin, Jeff Garzik; +Cc: netdev, linux-kernel

This patch series against the upstream branch of netdev-2.6 adds support
for these boards to the natsemi driver.  It implements some new
functionality required by the boards and enables the appropriate
settings when such a board is detected.

--
"You grabbed my hand and we fell into it, like a daydream - or a fever."

* [patch 1/4] natsemi: Add support for using MII port with no PHY
From: Mark Brown @ 2006-03-12 19:23 UTC
  To: Tim Hockin, Jeff Garzik; +Cc: netdev, linux-kernel
In-Reply-To: <20060312192259.929734000@mercator.sirena.org.uk>

[-- Attachment #1: natsemi-ignore-phy.patch --]
[-- Type: text/plain, Size: 6510 bytes --]

This patch provides a module option which configures the natsemi driver
to use the external MII port on the chip but ignore any PHYs that may be
attached to it.  The link state will be left as it was when the driver
started and can be configured via ethtool.  Any PHYs that are present
can be accessed via the MII ioctl()s.

This is useful for systems where the device is connected without a PHY
or where either information or actions outside the scope of the driver
are required in order to use the PHYs.

Signed-Off-By: Mark Brown <broonie@sirena.org.uk>

Index: natsemi-queue/drivers/net/natsemi.c
===================================================================
--- natsemi-queue.orig/drivers/net/natsemi.c	2006-02-25 13:38:34.000000000 +0000
+++ natsemi-queue/drivers/net/natsemi.c	2006-02-25 13:50:51.000000000 +0000
@@ -259,7 +259,7 @@ MODULE_PARM_DESC(debug, "DP8381x default
 MODULE_PARM_DESC(rx_copybreak, 
 	"DP8381x copy breakpoint for copy-only-tiny-frames");
 MODULE_PARM_DESC(options, 
-	"DP8381x: Bits 0-3: media type, bit 17: full duplex");
+	"DP8381x: Bits 0-3: media type, bit 17: full duplex, bit 18: ignore PHY");
 MODULE_PARM_DESC(full_duplex, "DP8381x full duplex setting(s) (1)");
 
 /*
@@ -690,6 +690,8 @@ struct netdev_private {
 	u32 intr_status;
 	/* Do not touch the nic registers */
 	int hands_off;
+	/* Don't pay attention to the reported link state. */
+	int ignore_phy;
 	/* external phy that is used: only valid if dev->if_port != PORT_TP */
 	int mii;
 	int phy_addr_external;
@@ -891,7 +893,19 @@ static int __devinit natsemi_probe1 (str
 	np->hands_off = 0;
 	np->intr_status = 0;
 
+	option = find_cnt < MAX_UNITS ? options[find_cnt] : 0;
+	if (dev->mem_start)
+		option = dev->mem_start;
+
+	/* Ignore the PHY status? */
+	if (option & 0x400) {
+		np->ignore_phy = 1;
+	} else {
+		np->ignore_phy = 0;
+	}
+
 	/* Initial port:
+	 * - If configured to ignore the PHY set up for external.
 	 * - If the nic was configured to use an external phy and if find_mii
 	 *   finds a phy: use external port, first phy that replies.
 	 * - Otherwise: internal port.
@@ -899,7 +913,7 @@ static int __devinit natsemi_probe1 (str
 	 * The address would be used to access a phy over the mii bus, but
 	 * the internal phy is accessed through mapped registers.
 	 */
-	if (readl(ioaddr + ChipConfig) & CfgExtPhy)
+	if (np->ignore_phy || readl(ioaddr + ChipConfig) & CfgExtPhy)
 		dev->if_port = PORT_MII;
 	else
 		dev->if_port = PORT_TP;
@@ -909,7 +923,9 @@ static int __devinit natsemi_probe1 (str
 
 	if (dev->if_port != PORT_TP) {
 		np->phy_addr_external = find_mii(dev);
-		if (np->phy_addr_external == PHY_ADDR_NONE) {
+		/* If we're ignoring the PHY it doesn't matter if we can't
+		 * find one. */
+		if (!np->ignore_phy && np->phy_addr_external == PHY_ADDR_NONE) {
 			dev->if_port = PORT_TP;
 			np->phy_addr_external = PHY_ADDR_INTERNAL;
 		}
@@ -917,10 +933,6 @@ static int __devinit natsemi_probe1 (str
 		np->phy_addr_external = PHY_ADDR_INTERNAL;
 	}
 
-	option = find_cnt < MAX_UNITS ? options[find_cnt] : 0;
-	if (dev->mem_start)
-		option = dev->mem_start;
-
 	/* The lower four bits are the media type. */
 	if (option) {
 		if (option & 0x200)
@@ -954,7 +966,10 @@ static int __devinit natsemi_probe1 (str
 	if (mtu)
 		dev->mtu = mtu;
 
-	netif_carrier_off(dev);
+	if (np->ignore_phy)
+		netif_carrier_on(dev);
+	else
+		netif_carrier_off(dev);
 
 	/* get the initial settings from hardware */
 	tmp            = mdio_read(dev, MII_BMCR);
@@ -1002,6 +1017,8 @@ static int __devinit natsemi_probe1 (str
 		printk("%02x, IRQ %d", dev->dev_addr[i], irq);
 		if (dev->if_port == PORT_TP)
 			printk(", port TP.\n");
+		else if (np->ignore_phy)
+			printk(", port MII, ignoring PHY\n");
 		else
 			printk(", port MII, phy ad %d.\n", np->phy_addr_external);
 	}
@@ -1682,42 +1699,44 @@ static void check_link(struct net_device
 {
 	struct netdev_private *np = netdev_priv(dev);
 	void __iomem * ioaddr = ns_ioaddr(dev);
-	int duplex;
+	int duplex = np->full_duplex;
 	u16 bmsr;
-       
-	/* The link status field is latched: it remains low after a temporary
-	 * link failure until it's read. We need the current link status,
-	 * thus read twice.
-	 */
-	mdio_read(dev, MII_BMSR);
-	bmsr = mdio_read(dev, MII_BMSR);
 
-	if (!(bmsr & BMSR_LSTATUS)) {
-		if (netif_carrier_ok(dev)) {
+	/* If we're not paying attention to the PHY status then don't check. */
+	if (!np->ignore_phy) {
+		/* The link status field is latched: it remains low
+		 * after a temporary link failure until it's read. We
+		 * need the current link status, thus read twice.
+		 */
+		mdio_read(dev, MII_BMSR);
+		bmsr = mdio_read(dev, MII_BMSR);
+
+		if (!(bmsr & BMSR_LSTATUS)) {
+			if (netif_carrier_ok(dev)) {
+				if (netif_msg_link(np))
+					printk(KERN_NOTICE "%s: link down.\n",
+					       dev->name);
+				netif_carrier_off(dev);
+				undo_cable_magic(dev);
+			}
+			return;
+		}
+		if (!netif_carrier_ok(dev)) {
 			if (netif_msg_link(np))
-				printk(KERN_NOTICE "%s: link down.\n",
-					dev->name);
-			netif_carrier_off(dev);
-			undo_cable_magic(dev);
+				printk(KERN_NOTICE "%s: link up.\n", dev->name);
+			netif_carrier_on(dev);
+			do_cable_magic(dev);
 		}
-		return;
-	}
-	if (!netif_carrier_ok(dev)) {
-		if (netif_msg_link(np))
-			printk(KERN_NOTICE "%s: link up.\n", dev->name);
-		netif_carrier_on(dev);
-		do_cable_magic(dev);
-	}
 
-	duplex = np->full_duplex;
-	if (!duplex) {
-		if (bmsr & BMSR_ANEGCOMPLETE) {
-			int tmp = mii_nway_result(
-				np->advertising & mdio_read(dev, MII_LPA));
-			if (tmp == LPA_100FULL || tmp == LPA_10FULL)
+		if (!duplex) {
+			if (bmsr & BMSR_ANEGCOMPLETE) {
+				int tmp = mii_nway_result(
+					np->advertising & mdio_read(dev, MII_LPA));
+				if (tmp == LPA_100FULL || tmp == LPA_10FULL)
+					duplex = 1;
+			} else if (mdio_read(dev, MII_BMCR) & BMCR_FULLDPLX)
 				duplex = 1;
-		} else if (mdio_read(dev, MII_BMCR) & BMCR_FULLDPLX)
-			duplex = 1;
+		}
 	}
 
 	/* if duplex is set then bit 28 must be set, too */
@@ -2927,6 +2946,16 @@ static int netdev_set_ecmd(struct net_de
 	}
 
 	/*
+	 * If we're ignoring the PHY then autoneg and the internal
+	 * transceiver are really not going to work so don't let the
+	 * user select them.
+	 */
+	if (np->ignore_phy && (ecmd->autoneg == AUTONEG_ENABLE ||
+			       ecmd->port == PORT_INTERNAL)) {
+		return -EINVAL;
+	}
+
+	/*
 	 * maxtxpkt, maxrxpkt: ignored for now.
 	 *
 	 * transceiver:
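
The option handling and port selection in this patch can be summarised in
a small stand-alone sketch. This is a hedged illustration, not driver code:
toy_options, toy_decode_options() and toy_pick_port() are hypothetical
names. The 0x400 flag and the low four media bits come from the patch; that
0x200 means full duplex is an assumption based on the parameter
description. Note also that 0x400 is 1 << 10, so the "bit 18" wording in
the parameter description may deserve a second look.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_OPT_MEDIA_MASK  0x00fu  /* lower four bits: media type */
#define TOY_OPT_FULL_DUPLEX 0x200u  /* assumption: full duplex flag */
#define TOY_OPT_IGNORE_PHY  0x400u  /* flag tested by the patch */

struct toy_options {
	unsigned int media;
	bool full_duplex;
	bool ignore_phy;
};

/* Split the numeric module option into its fields. */
struct toy_options toy_decode_options(uint32_t option)
{
	struct toy_options o;

	o.media = option & TOY_OPT_MEDIA_MASK;
	o.full_duplex = (option & TOY_OPT_FULL_DUPLEX) != 0;
	o.ignore_phy = (option & TOY_OPT_IGNORE_PHY) != 0;
	return o;
}

enum toy_port { TOY_PORT_TP, TOY_PORT_MII };

/*
 * Follows the probe logic in the patch: the MII port is chosen when the
 * PHY is ignored or the chip is strapped for an external PHY, and the
 * fallback to the internal port only happens when the PHY matters and
 * none was found.
 */
enum toy_port toy_pick_port(bool ignore_phy, bool cfg_ext_phy, bool phy_found)
{
	assert(TOY_OPT_IGNORE_PHY == (1u << 10));
	if (!(ignore_phy || cfg_ext_phy))
		return TOY_PORT_TP;
	if (!ignore_phy && !phy_found)
		return TOY_PORT_TP;
	return TOY_PORT_MII;
}
```

Under these assumptions, an option value of 0x400 selects the MII port
even when no PHY answers on the bus.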

--
"You grabbed my hand and we fell into it, like a daydream - or a fever."

* [patch 2/4] natsemi: Support oversized EEPROMs
From: Mark Brown @ 2006-03-12 19:23 UTC
  To: Tim Hockin, Jeff Garzik; +Cc: netdev, linux-kernel
In-Reply-To: <20060312192259.929734000@mercator.sirena.org.uk>

[-- Attachment #1: natsemi-variable-eeprom-size.patch --]
[-- Type: text/plain, Size: 2871 bytes --]

The natsemi chip can have a larger EEPROM attached than it itself uses
for configuration.  This patch adds support for user space access
to such an EEPROM.

Signed-off-by: Mark Brown <broonie@sirena.org.uk>

Index: natsemi-queue/drivers/net/natsemi.c
===================================================================
--- natsemi-queue.orig/drivers/net/natsemi.c	2006-02-25 17:40:15.000000000 +0000
+++ natsemi-queue/drivers/net/natsemi.c	2006-02-25 17:40:39.000000000 +0000
@@ -226,7 +226,7 @@ static int full_duplex[MAX_UNITS];
 				 NATSEMI_PG1_NREGS)
 #define NATSEMI_REGS_VER	1 /* v1 added RFDR registers */
 #define NATSEMI_REGS_SIZE	(NATSEMI_NREGS * sizeof(u32))
-#define NATSEMI_EEPROM_SIZE	24 /* 12 16-bit values */
+#define NATSEMI_DEF_EEPROM_SIZE	24 /* 12 16-bit values */
 
 /* Buffer sizes:
  * The nic writes 32-bit values, even if the upper bytes of
@@ -716,6 +716,8 @@ struct netdev_private {
 	unsigned int iosize;
 	spinlock_t lock;
 	u32 msg_enable;
+	/* EEPROM data */
+	int eeprom_size;
 };
 
 static void move_int_phy(struct net_device *dev, int addr);
@@ -892,6 +894,7 @@ static int __devinit natsemi_probe1 (str
 	np->msg_enable = (debug >= 0) ? (1<<debug)-1 : NATSEMI_DEF_MSG;
 	np->hands_off = 0;
 	np->intr_status = 0;
+	np->eeprom_size = NATSEMI_DEF_EEPROM_SIZE;
 
 	option = find_cnt < MAX_UNITS ? options[find_cnt] : 0;
 	if (dev->mem_start)
@@ -2601,7 +2604,8 @@ static int get_regs_len(struct net_devic
 
 static int get_eeprom_len(struct net_device *dev)
 {
-	return NATSEMI_EEPROM_SIZE;
+	struct netdev_private *np = netdev_priv(dev);
+	return np->eeprom_size;
 }
 
 static int get_settings(struct net_device *dev, struct ethtool_cmd *ecmd)
@@ -2688,15 +2692,20 @@ static u32 get_link(struct net_device *d
 static int get_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom, u8 *data)
 {
 	struct netdev_private *np = netdev_priv(dev);
-	u8 eebuf[NATSEMI_EEPROM_SIZE];
+	u8 *eebuf;
 	int res;
 
+	eebuf = kmalloc(np->eeprom_size, GFP_KERNEL);
+	if (!eebuf)
+		return -ENOMEM;
+
 	eeprom->magic = PCI_VENDOR_ID_NS | (PCI_DEVICE_ID_NS_83815<<16);
 	spin_lock_irq(&np->lock);
 	res = netdev_get_eeprom(dev, eebuf);
 	spin_unlock_irq(&np->lock);
 	if (!res)
 		memcpy(data, eebuf+eeprom->offset, eeprom->len);
+	kfree(eebuf);
 	return res;
 }
 
@@ -3062,9 +3071,10 @@ static int netdev_get_eeprom(struct net_
 	int i;
 	u16 *ebuf = (u16 *)buf;
 	void __iomem * ioaddr = ns_ioaddr(dev);
+	struct netdev_private *np = netdev_priv(dev);
 
 	/* eeprom_read reads 16 bits, and indexes by 16 bits */
-	for (i = 0; i < NATSEMI_EEPROM_SIZE/2; i++) {
+	for (i = 0; i < np->eeprom_size/2; i++) {
 		ebuf[i] = eeprom_read(ioaddr, i);
 		/* The EEPROM itself stores data bit-swapped, but eeprom_read
 		 * reads it back "sanely". So we swap it back here in order to
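
The move from a fixed stack array to a per-device heap buffer can be
sketched in user space as follows. This is an illustration only: toy_nic,
toy_read_eeprom() and toy_get_eeprom() are hypothetical names, malloc()
stands in for kmalloc(), and the explicit range check is an addition made
to keep the sketch self-contained. In the kernel that validation is
assumed to happen before the driver callback runs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct toy_nic {
	size_t eeprom_size;            /* bytes, set per device at probe time */
	const uint16_t *eeprom_words;  /* stand-in for the hardware EEPROM */
};

/* Stand-in for netdev_get_eeprom(): copy the EEPROM out 16 bits at a time. */
int toy_read_eeprom(const struct toy_nic *nic, uint8_t *buf)
{
	size_t i;

	for (i = 0; i < nic->eeprom_size / 2; i++) {
		uint16_t word = nic->eeprom_words[i];

		memcpy(buf + 2 * i, &word, sizeof(word));
	}
	return 0;
}

/* Read len bytes starting at offset into data. */
int toy_get_eeprom(const struct toy_nic *nic, size_t offset, size_t len,
		   uint8_t *data)
{
	uint8_t *eebuf;
	int res;

	assert(nic != NULL && data != NULL);

	/* Reject requests that would run past the end of the buffer. */
	if (offset > nic->eeprom_size || len > nic->eeprom_size - offset)
		return -EINVAL;

	/* Size is no longer a compile-time constant, so allocate it. */
	eebuf = malloc(nic->eeprom_size);
	if (!eebuf)
		return -ENOMEM;

	res = toy_read_eeprom(nic, eebuf);
	if (!res)
		memcpy(data, eebuf + offset, len);
	free(eebuf);	/* released on both the success and error paths */
	return res;
}
```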

--
"You grabbed my hand and we fell into it, like a daydream - or a fever."

* [patch 3/4] Add a PCI vendor ID definition for Aculab
From: Mark Brown @ 2006-03-12 19:23 UTC
  To: Tim Hockin, Jeff Garzik; +Cc: netdev, linux-kernel
In-Reply-To: <20060312192259.929734000@mercator.sirena.org.uk>

[-- Attachment #1: pci-vendor-aculab.patch --]
[-- Type: text/plain, Size: 735 bytes --]

Add a vendor ID definition for Aculab.

Signed-Off-By: Mark Brown <broonie@sirena.org.uk>

Index: e1000-queue/include/linux/pci_ids.h
===================================================================
--- e1000-queue.orig/include/linux/pci_ids.h	2006-02-25 12:50:12.000000000 +0000
+++ e1000-queue/include/linux/pci_ids.h	2006-02-25 12:51:51.000000000 +0000
@@ -1572,6 +1572,8 @@
 #define PCI_VENDOR_ID_NVIDIA_SGS	0x12d2
 #define PCI_DEVICE_ID_NVIDIA_SGS_RIVA128 0x0018
 
+#define PCI_VENDOR_ID_ACULAB 0x12d9
+
 #define PCI_SUBVENDOR_ID_CHASE_PCIFAST		0x12E0
 #define PCI_SUBDEVICE_ID_CHASE_PCIFAST4		0x0031
 #define PCI_SUBDEVICE_ID_CHASE_PCIFAST8		0x0021

--
"You grabbed my hand and we fell into it, like a daydream - or a fever."

* [patch 4/4] natsemi: Add quirks for Aculab E1/T1 PMXc cPCI carrier cards
From: Mark Brown @ 2006-03-12 19:23 UTC
  To: Tim Hockin, Jeff Garzik; +Cc: netdev, linux-kernel
In-Reply-To: <20060312192259.929734000@mercator.sirena.org.uk>

[-- Attachment #1: natsemi-aculab-cpci-carrier.patch --]
[-- Type: text/plain, Size: 6563 bytes --]

Aculab E1/T1 PMXc cPCI carrier cards present a natsemi on the cPCI
bus wired up in a non-standard fashion.  This patch provides support in
the natsemi driver for these cards by implementing a quirk mechanism and
using that to configure appropriate settings for the card: forcing 100M
full duplex, having a large EEPROM and using the MII port while ignoring
PHYs.

Signed-off-by: Mark Brown <broonie@sirena.org.uk>

Index: natsemi-queue/drivers/net/natsemi.c
===================================================================
--- natsemi-queue.orig/drivers/net/natsemi.c	2006-02-25 17:41:59.000000000 +0000
+++ natsemi-queue/drivers/net/natsemi.c	2006-03-08 21:44:12.000000000 +0000
@@ -226,7 +226,6 @@ static int full_duplex[MAX_UNITS];
 				 NATSEMI_PG1_NREGS)
 #define NATSEMI_REGS_VER	1 /* v1 added RFDR registers */
 #define NATSEMI_REGS_SIZE	(NATSEMI_NREGS * sizeof(u32))
-#define NATSEMI_DEF_EEPROM_SIZE	24 /* 12 16-bit values */
 
 /* Buffer sizes:
  * The nic writes 32-bit values, even if the upper bytes of
@@ -344,12 +343,14 @@ None characterised.
 
 
 
-enum pcistuff {
+enum natsemi_quirks {
 	PCI_USES_IO = 0x01,
 	PCI_USES_MEM = 0x02,
 	PCI_USES_MASTER = 0x04,
 	PCI_ADDR0 = 0x08,
 	PCI_ADDR1 = 0x10,
+	MEDIA_FORCE_100FD = 0x20,
+	MEDIA_IGNORE_PHY = 0x40,
 };
 
 /* MMIO operations required */
@@ -367,17 +368,21 @@ enum pcistuff {
 #define MII_FX_SEL	0x0001	/* 100BASE-FX (fiber) */
 #define MII_EN_SCRM	0x0004	/* enable scrambler (tp) */
 
- 
 /* array of board data directly indexed by pci_tbl[x].driver_data */
 static const struct {
 	const char *name;
 	unsigned long flags;
+	int quirks;
+	int eeprom_size;
 } natsemi_pci_info[] __devinitdata = {
-	{ "NatSemi DP8381[56]", PCI_IOTYPE },
+	{ "NatSemi DP8381[56]", PCI_IOTYPE, 0, 24 },
+	{ "Aculab E1/T1 PMXc cPCI carrier card", PCI_IOTYPE,
+	                          MEDIA_FORCE_100FD | MEDIA_IGNORE_PHY, 128 },
 };
 
 static struct pci_device_id natsemi_pci_tbl[] = {
-	{ PCI_VENDOR_ID_NS, PCI_DEVICE_ID_NS_83815, PCI_ANY_ID, PCI_ANY_ID, },
+	{ PCI_VENDOR_ID_NS, PCI_DEVICE_ID_NS_83815, PCI_VENDOR_ID_ACULAB, PCI_SUBDEVICE_ID_ACULAB_174, 0, 0, 1 },
+	{ PCI_VENDOR_ID_NS, PCI_DEVICE_ID_NS_83815, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
 	{ 0, },
 };
 MODULE_DEVICE_TABLE(pci, natsemi_pci_tbl);
@@ -815,6 +820,39 @@ static void move_int_phy(struct net_devi
 	udelay(1);
 }
 
+static void __devinit natsemi_init_media (struct net_device *dev)
+{
+	struct netdev_private *np = netdev_priv(dev);
+	u32 tmp;
+
+	tmp            = mdio_read(dev, MII_BMCR);
+	np->speed      = (tmp & BMCR_SPEED100)? SPEED_100     : SPEED_10;
+	np->duplex     = (tmp & BMCR_FULLDPLX)? DUPLEX_FULL   : DUPLEX_HALF;
+	np->autoneg    = (tmp & BMCR_ANENABLE)? AUTONEG_ENABLE: AUTONEG_DISABLE;
+	np->advertising= mdio_read(dev, MII_ADVERTISE);
+
+	if ((np->advertising & ADVERTISE_ALL) != ADVERTISE_ALL
+	 && netif_msg_probe(np)) {
+		printk(KERN_INFO "natsemi %s: Transceiver default autonegotiation %s "
+			"10%s %s duplex.\n",
+			pci_name(np->pci_dev),
+			(mdio_read(dev, MII_BMCR) & BMCR_ANENABLE)?
+			  "enabled, advertise" : "disabled, force",
+			(np->advertising &
+			  (ADVERTISE_100FULL|ADVERTISE_100HALF))?
+			    "0" : "",
+			(np->advertising &
+			  (ADVERTISE_100FULL|ADVERTISE_10FULL))?
+			    "full" : "half");
+	}
+	if (netif_msg_probe(np))
+		printk(KERN_INFO
+			"natsemi %s: Transceiver status %#04x advertising %#04x.\n",
+			pci_name(np->pci_dev), mdio_read(dev, MII_BMSR),
+			np->advertising);
+
+}
+
 static int __devinit natsemi_probe1 (struct pci_dev *pdev,
 	const struct pci_device_id *ent)
 {
@@ -894,17 +932,21 @@ static int __devinit natsemi_probe1 (str
 	np->msg_enable = (debug >= 0) ? (1<<debug)-1 : NATSEMI_DEF_MSG;
 	np->hands_off = 0;
 	np->intr_status = 0;
-	np->eeprom_size = NATSEMI_DEF_EEPROM_SIZE;
+	np->eeprom_size = natsemi_pci_info[chip_idx].eeprom_size;
 
 	option = find_cnt < MAX_UNITS ? options[find_cnt] : 0;
 	if (dev->mem_start)
 		option = dev->mem_start;
 
 	/* Ignore the PHY status? */
-	if (option & 0x400) {
+	if (natsemi_pci_info[chip_idx].quirks & MEDIA_IGNORE_PHY) {
 		np->ignore_phy = 1;
 	} else {
-		np->ignore_phy = 0;
+		if (option & 0x400) {
+			np->ignore_phy = 1;
+		} else {
+			np->ignore_phy = 0;
+		}
 	}
 
 	/* Initial port:
@@ -936,6 +978,12 @@ static int __devinit natsemi_probe1 (str
 		np->phy_addr_external = PHY_ADDR_INTERNAL;
 	}
 
+	/* Apply any speed quirks. */
+	if (natsemi_pci_info[chip_idx].quirks & MEDIA_FORCE_100FD) {
+		np->speed = 100;
+		np->duplex = 1;
+	}
+
 	/* The lower four bits are the media type. */
 	if (option) {
 		if (option & 0x200)
@@ -974,32 +1022,10 @@ static int __devinit natsemi_probe1 (str
 	else
 		netif_carrier_off(dev);
 
-	/* get the initial settings from hardware */
-	tmp            = mdio_read(dev, MII_BMCR);
-	np->speed      = (tmp & BMCR_SPEED100)? SPEED_100     : SPEED_10;
-	np->duplex     = (tmp & BMCR_FULLDPLX)? DUPLEX_FULL   : DUPLEX_HALF;
-	np->autoneg    = (tmp & BMCR_ANENABLE)? AUTONEG_ENABLE: AUTONEG_DISABLE;
-	np->advertising= mdio_read(dev, MII_ADVERTISE);
-
-	if ((np->advertising & ADVERTISE_ALL) != ADVERTISE_ALL
-	 && netif_msg_probe(np)) {
-		printk(KERN_INFO "natsemi %s: Transceiver default autonegotiation %s "
-			"10%s %s duplex.\n",
-			pci_name(np->pci_dev),
-			(mdio_read(dev, MII_BMCR) & BMCR_ANENABLE)?
-			  "enabled, advertise" : "disabled, force",
-			(np->advertising &
-			  (ADVERTISE_100FULL|ADVERTISE_100HALF))?
-			    "0" : "",
-			(np->advertising &
-			  (ADVERTISE_100FULL|ADVERTISE_10FULL))?
-			    "full" : "half");
-	}
-	if (netif_msg_probe(np))
-		printk(KERN_INFO
-			"natsemi %s: Transceiver status %#04x advertising %#04x.\n",
-			pci_name(np->pci_dev), mdio_read(dev, MII_BMSR),
-			np->advertising);
+	/* get the initial settings from hardware if we don't have any
+ 	 * already */
+	if (!np->speed)
+		natsemi_init_media(dev);
 
 	/* save the silicon revision for later querying */
 	np->srr = readl(ioaddr + SiliconRev);
Index: natsemi-queue/include/linux/pci_ids.h
===================================================================
--- natsemi-queue.orig/include/linux/pci_ids.h	2006-02-25 19:19:55.000000000 +0000
+++ natsemi-queue/include/linux/pci_ids.h	2006-02-25 19:21:28.000000000 +0000
@@ -1573,6 +1573,7 @@
 #define PCI_DEVICE_ID_NVIDIA_SGS_RIVA128 0x0018
 
 #define PCI_VENDOR_ID_ACULAB 0x12d9
+#define PCI_SUBDEVICE_ID_ACULAB_174      0x000c
 
 #define PCI_SUBVENDOR_ID_CHASE_PCIFAST		0x12E0
 #define PCI_SUBDEVICE_ID_CHASE_PCIFAST4		0x0031
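
The quirk mechanism in this patch amounts to a table lookup keyed on the
PCI IDs. The stand-alone sketch below illustrates the idea with
hypothetical names (toy_board_info, toy_id, toy_lookup()); it is not the
driver or PCI core code. The Aculab subsystem IDs are the ones from the
patch, while the vendor/device pair is only an illustrative value. The
lookup returns the first matching entry, which is why the specific
subsystem entry has to come before the wildcard one; this is assumed to
mirror how an ID table is walked.

```c
#include <assert.h>
#include <stddef.h>

enum toy_quirks {
	TOY_FORCE_100FD = 0x20,
	TOY_IGNORE_PHY  = 0x40,
};

struct toy_board_info {
	const char *name;
	int quirks;
	int eeprom_size;	/* bytes */
};

static const struct toy_board_info toy_boards[] = {
	{ "generic board", 0, 24 },
	{ "carrier card", TOY_FORCE_100FD | TOY_IGNORE_PHY, 128 },
};

#define TOY_ANY (~0u)

struct toy_id {
	unsigned int vendor, device, subvendor, subdevice;
	size_t board;	/* index into toy_boards[] */
};

static const struct toy_id toy_ids[] = {
	{ 0x100b, 0x0020, 0x12d9, 0x000c, 1 },	/* specific entry first */
	{ 0x100b, 0x0020, TOY_ANY, TOY_ANY, 0 },	/* wildcard fallback */
};

/* Return the board data for the first matching ID entry, or NULL. */
const struct toy_board_info *toy_lookup(unsigned int vendor,
					unsigned int device,
					unsigned int subvendor,
					unsigned int subdevice)
{
	size_t i;

	for (i = 0; i < sizeof(toy_ids) / sizeof(toy_ids[0]); i++) {
		const struct toy_id *id = &toy_ids[i];

		if (id->vendor != vendor || id->device != device)
			continue;
		if (id->subvendor != TOY_ANY && id->subvendor != subvendor)
			continue;
		if (id->subdevice != TOY_ANY && id->subdevice != subdevice)
			continue;
		assert(id->board < sizeof(toy_boards) / sizeof(toy_boards[0]));
		return &toy_boards[id->board];
	}
	return NULL;
}
```

If the two toy_ids entries were swapped, the wildcard entry would match
the carrier card as well and the quirks would never be applied.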

--
"You grabbed my hand and we fell into it, like a daydream - or a fever."

* Re: [patch 1/4] natsemi: Add support for using MII port with no PHY
From: thockin @ 2006-03-12 21:41 UTC
  To: Mark Brown; +Cc: Jeff Garzik, netdev, linux-kernel
In-Reply-To: <20060312205303.869316000@mercator.sirena.org.uk>

On Sun, Mar 12, 2006 at 07:23:00PM +0000, Mark Brown wrote:
> This patch provides a module option which configures the natsemi driver
> to use the external MII port on the chip but ignore any PHYs that may be
> attached to it.  The link state will be left as it was when the driver
> started and can be configured via ethtool.  Any PHYs that are present
> can be accessed via the MII ioctl()s.
> 
> This is useful for systems where the device is connected without a PHY
> or where either information or actions outside the scope of the driver
> are required in order to use the PHYs.

Not that my opinion should hold much weight, having been absent from the
driver for some time, but yuck.  Is there no better way to do this than
sprinkling poo all over it?

* Re: [RFC: 2.6 patch] hostap_hw.c:hfa384x_set_rid(): fix error handling
From: Jouni Malinen @ 2006-03-13  1:15 UTC
  To: Adrian Bunk; +Cc: netdev, hostap, linux-kernel, linville
In-Reply-To: <20060309230646.GI21864@stusta.de>

On Fri, Mar 10, 2006 at 12:06:46AM +0100, Adrian Bunk wrote:

> The Coverity checker noted that the call to prism2_hw_reset() was dead 
> code.
> 
> Does this patch change the code to what was intended?

Thanks! Based on my CVS history, it looks like this was broken in 2002
when the access command was moved from another function and verification
of -ETIMEDOUT value was not moved correctly. The original behavior would
be achieved by changing your patch to call printk first before the moved
prism2_hw_reset(dev) call. I added this (with the re-ordered printk) to
my queue for wireless-2.6.

-- 
Jouni Malinen                                            PGP id EFC895FA

* Re: [2.6 patch] hostap_ap.c:hostap_add_sta(): inconsequent NULL checking
From: Jouni Malinen @ 2006-03-13  1:23 UTC
  To: Adrian Bunk; +Cc: hostap, netdev, linux-kernel
In-Reply-To: <20060310191026.GS21864@stusta.de>

On Fri, Mar 10, 2006 at 08:10:26PM +0100, Adrian Bunk wrote:
> The Coverity checker spotted this inconsequent NULL checking 
> (unconditionally dereferencing directly after checking for NULL
> isn't a good idea).

Thanks! Added to my queue for wireless-2.6.

-- 
Jouni Malinen                                            PGP id EFC895FA

* Re: [PATCH] TC: bug fixes to the "sample" clause
From: Russell Stuart @ 2006-03-13  4:44 UTC
  To: hadi; +Cc: netdev, lartc
In-Reply-To: <1142082696.5184.53.camel@jzny2>

On Sat, 2006-03-11 at 08:11 -0500, jamal wrote: 
> > On my machine tc does not parse filter "sample" for the u32
> > filter.  Eg:
> > 
> > tc filter add dev eth2 parent 1:0 protocol ip prio 1 u32 ht 801: \ 
> >     classid 1:3 \
> >     sample ip protocol 1 0xff match ip protocol 1 0xff
> >   Illegal "sample"
> > 
> 
> The syntax is correct but ht 801: must exist - that is why it is being
> rejected.. So there is absolutely categorically _no way in hell_ your
> memset would have fixed it ;-> Apologies for being overdramatic ;->

You are wrong on both counts.

Firstly, the error message came from tc when it parsed
the line.  If tc gets an error talking to the kernel it 
says, among other things:
  "We have an error talking to the kernel"
Ergo, it hadn't given the command to the kernel yet.
This is significant, because the only place that
knows whether ht 801: has been created or not is
the kernel.  So obviously the error message can't
depend on whether the table had been created.

As it happens I did create the ht before issuing the
prior command when I first struck the bug - but I
didn't show it in my example because it was not
relevant.

As for _no way in hell_ - there are apparently more ways
in hell than you are aware of.  If you look at the 
function pack_key() in tc/f_u32.c, you will see it
assumes the "sel" parameter it is passed is initialised.
Without the added "memset()" it isn't - it just contains
random crap.  Of course on some machines (perhaps yours?)
that random crap might be 0, and then it would work.  
That is why I said at the start of my patch "On my 
machine tc does not ...".  On other machines the bug
may not appear.

> sample never worked 100% of the time with that hash. It works _most_ of
> the time with 256 buckets. Infact it will work some of the time as it is
> right now with 2.6.x. Can you post the output of tc -s filter ls on 2.6
> with and without your user space change?

Re: "sample never worked 100% of the time with that 
hash".  Can you give an example?  As far as I know it
always worked.

Re: "it will work some of the time as it is right now 
with 2.6.x".  Yes - it will work when you are sampling
one byte.   I am sampling port numbers, which are two
bytes.  It will not work in any case where there are
two non-zero bytes.

Re: "Can you post the output of tc -s filter ls on 2.6 
with and without your user space change?".  Here it is:

  With my change:
    tc qdisc add dev eth0 root handle 1: htb
    tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256
    tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample tcp src 0x369 0xffff match tcp src 0x0369 0xffff
    tc -s filter show dev eth0
    filter parent 1: protocol ip pref 1 u32
    filter parent 1: protocol ip pref 1 u32 fh 801: ht divisor 256
    filter parent 1: protocol ip pref 1 u32 fh 801:3:800 order 2048 key ht 801 bkt 3 flowid 1:
      match 03690000/ffff0000 at nexthdr+0
    filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1

  On the original "tc" shipped with Debian "sarge":
    tc qdisc add dev eth0 root handle 1: htb
    tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256
    tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample tcp src 0x369 0xffff match tcp src 0x0369 0xffff
    Illegal "sample"
  Ooops.  Looks like I can't get out of this without patching
  and compiling:
    tc qdisc add dev eth0 root handle 1: htb
    tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256
    tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample tcp src 0x369 0xffff match tcp src 0x0369 0xffff
    tc -s filter show dev eth0
    filter parent 1: protocol ip pref 1 u32
    filter parent 1: protocol ip pref 1 u32 fh 801: ht divisor 256
    filter parent 1: protocol ip pref 1 u32 fh 801:6a:800 order 2048 key ht 801 bkt 6a flowid 1:
      match 03690000/ffff0000 at nexthdr+0
    filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1

Note: this also answers your request in your other email re:
"can you give me an example that doesn't work".
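The two bucket numbers in the listings above (3 and 6a) are what you would get from two different ways of reducing the 16-bit port 0x0369 to an 8-bit bucket index. The Python sketch below is an illustration only, under stated assumptions: the function names are made up, the XOR fold and the mask-and-shift fold are assumed forms of the two hashes rather than code quoted from this thread or from kernel source, and the key is taken as 0x6903 (the bytes 03 69 as a little-endian host would read them).

```python
def fold_xor(key: int, divisor: int = 256) -> int:
    """XOR-fold a 32-bit key into a bucket index (assumed older behaviour)."""
    h = key & 0xFFFFFFFF
    h ^= h >> 16
    h ^= h >> 8
    return h % divisor


def fold_mask(key: int, mask: int, divisor: int = 256) -> int:
    """Mask the key, shift down to the lowest set mask bit, keep the low bits
    (assumed newer behaviour)."""
    shift = (mask & -mask).bit_length() - 1
    return ((key & mask) >> shift) & (divisor - 1)


# Assumption: port 0x0369 in network byte order, viewed on a little-endian host.
key, mask = 0x6903, 0xFFFF
print(hex(fold_xor(key)))         # 0x6a
print(hex(fold_mask(key, mask)))  # 0x3
```

With the XOR fold both bytes of the port influence the bucket; with the mask-and-shift fold only one byte does, which fits the remark above about sampling two non-zero bytes.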

> Heres how you should fix this:
> Patch1) fix kernel 2.4.x to be like 2.6.x
> patch 2) fix iproute2 to have the same syntax as for 2.6 and put a big
> bold note in the code that if anyone changes that part of the code to
> look at the kernel u32_hash_fold() routine.
> no need for the utsname check.

Why do it that way?  If you want to put the 2.6 hashing
algorithm in 2.4 then go ahead - but that is a separate 
decision which is not related to the issue of making tc 
backwards compatible.  Making tc work with all versions of
the kernel is what I am doing there. As an example of why
this is a good idea, Debian ships 2.4 and 2.6 kernels, 
and one version of tc.  That tc should work with both 
kernels.

Finally, regarding which hashing algorithm is better,
your results differ from mine.  First a bit of 
background: I am trying to make VOIP work over some 256/64 
and 512/128 links that carry all sorts of other traffic.
The patches you see here are a result of me trying to
make that work over a range of sites (companies and
home setups).  The end result is that it did work, better
than I expected it to.  On a 512/128 ADSL link, I can:

  a.  Saturate the incoming direction with a large wget,
  b.  Saturate the outgoing direction with a large wget,
  c.  Hit the incoming direction with a "while :; do
      wget http://www.google.com/ -O /dev/null; done".
  d.  Hit the outgoing direction with an external box
      doing the same while loop.

While all that is going on, I can sustain 2 VOIP calls
with perfect clarity on that link.  I wasn't expecting to
be able to pull that off.  To pull it off I created the
u32 filter from hell.  This long-winded explanation is to
forestall the inevitable "why the hell would you want to do
that" flame when you see the script that follows.

You can find the shell script that creates the filter here:
  http://www.stuart.id.au/russell/files/tc/setup-traffic-control.sh
The script itself is not that important.  What is important
is that it is a real-life use of u32, and there are approx 
1200 hashed u32 filter lines.

So how to measure how good a job the hash is doing?  The
easiest would seem to be to use a least squares fit, ie:

  Number of elements to be hashed       = E
  Number of buckets                     = B   (== 256)
  Optimal number of elements per bucket = E/B
  Hash function "goodness" metric       = M =
    sqrt(sum foreach bucket i: [ (NrOfElementsInBucket(i) - E/B)^2 ] / B)

For a good hash function M < 1.  The bigger M is the
worse the hash function is.  I wrote a python program to 
compute M for my u32 filter.  The python program can be 
found here:
  http://www.stuart.id.au/russell/files/tc/tc-ports
The dataset it operates on can be found here:
  http://www.stuart.id.au/russell/files/tc/m.py
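
The metric M defined above can be sketched in a few lines of Python. This is not the program linked above, only a minimal illustration of the formula; the function name and the sample bucket counts are made up.

```python
import math


def hash_goodness(bucket_counts):
    """Root-mean-square deviation of bucket occupancy from the ideal E/B."""
    b = len(bucket_counts)   # B: number of buckets
    e = sum(bucket_counts)   # E: number of hashed elements
    ideal = e / b            # optimal number of elements per bucket
    return math.sqrt(sum((n - ideal) ** 2 for n in bucket_counts) / b)


# Illustrative data only: 8 elements spread over 4 buckets.
print(hash_goodness([2, 2, 2, 2]))  # perfectly even spread
print(hash_goodness([8, 0, 0, 0]))  # everything piled into one bucket
```

An even spread gives M = 0, and M grows as elements pile up in a few buckets.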

Results:
  tcp 2.6: E=534  M=2.35
  tcp 2.4: E=534  M=0.82
  udp 2.6: E=711  M=2.91
  udp 2.4: E=711  M=0.92

As you can probably tell, I see the new hash function in 2.6
as a backward step - not an improvement.

^ permalink raw reply

* RE: Router stops routing after changing MAC Address
From: Greg Scott @ 2006-03-13 12:15 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: linux-kernel, David S. Miller, netdev, Bart Samwel, Alan Cox,
	Simon Mackinlay

On eth0 - no. My "fudged" MAC Address is based on the IP Address.  So
1.2.3.50 becomes 001.002.003.050, which turns into 00:10:02:00:30:50.
But 1.2.3 is fake - it isn't the one I really use.  The other one,
172.16.16.3 - that is a real IP Address that turns into
17:20:16:01:60:03.  And here I thought I was pretty clever - it never
dawned on me in my wildest dreams that those bits had any special
meaning!  I will do some homework about what all the bits mean and then
put together another scheme for my fudged MAC Addresses and post the
results here.
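
A minimal Python sketch of the scheme described above, plus the check Chuck's reply points at. The function names are hypothetical. The only fact assumed beyond this message is that an Ethernet address is a multicast (group) address when the low-order bit of its first octet is set.

```python
def ip_to_fudged_mac(ip: str) -> str:
    """Zero-pad each IP octet to 3 decimal digits, then regroup the 12 digits in pairs."""
    digits = "".join(f"{int(octet):03d}" for octet in ip.split("."))
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def is_multicast(mac: str) -> bool:
    """True when the low-order bit of the first octet (the group bit) is set."""
    return bool(int(mac.split(":")[0], 16) & 1)


for ip in ("1.2.3.50", "172.16.16.3"):
    mac = ip_to_fudged_mac(ip)
    print(ip, mac, "multicast" if is_multicast(mac) else "unicast")
```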

- Greg



-----Original Message-----
From: Chuck Ebbert [mailto:76306.1226@compuserve.com] 
Sent: Monday, March 13, 2006 12:11 AM
To: Greg Scott
Cc: linux-kernel; David S. Miller
Subject: Re: Router stops routing after changing MAC Address

In-Reply-To:
<925A849792280C4E80C5461017A4B8A20321CC@mail733.InfraSupportEtc.com>

On Fri, 10 Mar 2006 18:33:15 -0600, Greg Scott wrote:

> How to change MAC addresses is documented well enough - and it works -
> but when I change MAC addresses, my router stops routing.  From the 
> router, I can see the systems on both sides - but the router just 
> refuses to forward packets.  Here are my little test scripts to change
> MAC Addresses.
> 
> First - ip-fudge-mac.sh
> [root@test-fw2 gregs]# more ip-fudge-mac.sh
> ip link set eth0 down
> ip link set eth0 address 01:02:03:04:05:06
                            ^
 Bit zero is set, so this is a multicast address.  Is that intentional?

> ip link set eth0 up
> 
> ip link set eth1 down
> ip link set eth1 address 17:20:16:01:60:03
                            ^
 Ditto.

> ip link set eth1 up
> 
> echo "1" > /proc/sys/net/ipv4/ip_forward


--
Chuck
"Penguins don't come from next door, they come from the Antarctic!"

^ permalink raw reply

* kmalloc_node returns unaligned memory
From: Olaf Hering @ 2006-03-13 14:45 UTC (permalink / raw)
  To: linux-kernel, Jaroslav Kysela; +Cc: netdev

kmalloc_node returns unaligned pointers on powerpc, when CONFIG_DEBUG_SLAB
is enabled. This makes iptables very unhappy. It checks the alignment in
net/ipv6/netfilter/ip6_tables.c:check_entry_size_and_hooks(). 
__alignof__(struct ip6t_entry) returns 8. But returned pointers from
xt_alloc_table_info() are unaligned:

Linux version 2.6.16-rc6-git1-default-iptables-slab (olaf@pomegranate) (gcc version 4.1.0 (SUSE Linux)) #2 Mon Mar 13 15:19:45 CET 2006
...
 xt_alloc_table_info(250) modprobe(1687):c0,j4294904016 newinfo/size cfc82498/0x278
 xt_alloc_table_info(265) modprobe(1687):c0,j4294904038 entries[0] c449611c
 ip_nat_init: can't setup rules.
 sys_init_module(1960) modprobe(1687):c0,j4294904071 iptable_nat returned -22
...

Any ideas how to fix that?

^ permalink raw reply

* Re: GigE on PowerMac G5
From: Andreas Schwab @ 2006-03-13 14:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: netdev, linuxppc64-dev
In-Reply-To: <jer75exvpa.fsf@sykes.suse.de>

Andreas Schwab <schwab@suse.de> writes:

> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
>
>> At this point, all I can say is... does it work in OS X ?
>
> Strange, OS X can't do it either.  Looks like I have a hardware problem.

It turned out that one of the contacts in the RJ-45 jack was twisted.
After straightening it the Gb connection is working now.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply

* Re: kmalloc_node returns unaligned memory
From: Jes Sorensen @ 2006-03-13 14:49 UTC (permalink / raw)
  To: Olaf Hering; +Cc: linux-kernel, Jaroslav Kysela, netdev, Dean Nelson
In-Reply-To: <20060313144513.GA9542@suse.de>

>>>>> "Olaf" == Olaf Hering <olh@suse.de> writes:

Olaf> kmalloc_node returns unaligned pointers on powerpc, when
Olaf> CONFIG_DEBUG_SLAB is enabled. This makes iptables very
Olaf> unhappy. It checks the alignment in
Olaf> net/ipv6/netfilter/ip6_tables.c:check_entry_size_and_hooks().
Olaf> __alignof__(struct ip6t_entry) returns 8. But returned pointers
Olaf> from xt_alloc_table_info() are unaligned:

Hi Olaf,

I believe this is expected behavior ;-(

We have the same problem with the XPC driver for the SN2 which resulted
in a wrapper macro being created for it.

Some sort of SLAB_HWCACHE_ALIGN flag to Slab that was always respected
for this would be nice.

Cheers,
Jes

^ permalink raw reply

* Re: [PATCH 5/6 v2] IB: IP address based RDMA connection manager
From: Roland Dreier @ 2006-03-13 15:43 UTC (permalink / raw)
  To: Sean Hefty; +Cc: netdev, linux-kernel, openib-general
In-Reply-To: <ORSMSX401FRaqbC8wSA0000000e@orsmsx401.amr.corp.intel.com>

It seems that cma_detach_from_dev():

 > +static void cma_detach_from_dev(struct rdma_id_private *id_priv)
 > +{
 > +	list_del(&id_priv->list);
 > +	if (atomic_dec_and_test(&id_priv->cma_dev->refcount))
 > +		wake_up(&id_priv->cma_dev->wait);
 > +	id_priv->cma_dev = NULL;
 > +}

doesn't need to do atomic_dec_and_test(), because it is never dropping
the last reference to id_priv (and in fact if it was, the last line
would be a use-after-free bug).

Does it make sense to replace it with:

	static void cma_detach_from_dev(struct rdma_id_private *id_priv)
	{
		list_del(&id_priv->list);
		/*
		 * cma_detach_from_dev() will never be dropping the last
		 * reference to id_priv, so no need to test here.
		 */
		atomic_dec(&id_priv->cma_dev->refcount);
		id_priv->cma_dev = NULL;
	}

on my x86_64 build that's worth

	add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-40 (-40)
	function                                     old     new   delta
	cma_detach_from_dev                          106      66     -40

 - R.

^ permalink raw reply

* Re: [UPDATED PATCH] Re: [Lse-tech] Re: [Patch 7/7] Generic netlink interface (delay accounting)
From: Balbir Singh @ 2006-03-13 16:21 UTC (permalink / raw)
  To: jamal; +Cc: Shailabh Nagar, netdev, linux-kernel, lse-tech
In-Reply-To: <1142083849.5184.69.camel@jzny2>

On Sat, Mar 11, 2006 at 08:30:49AM -0500, jamal wrote:
> On Fri, 2006-10-03 at 22:09 +0530, Balbir Singh wrote:
> > On Fri, Mar 10, 2006 at 09:53:53AM -0500, jamal wrote: 
> 
> > > On kernel->user (in the case of response to #a or async notifiation as
> > > in #b) you really dont need to specify the TG/PID since they appear in
> > > the STATS etc.
> > 
> > I see your point now. I am looking at other users of netlink like
> > rtnetlink and I see the classical usage.
> > 
> > We can implement TLV's in our code, but for the most part the data we exchange
> > between the user <-> kernel has all the TLV's listed in the enum above.
> >
> > The major differnece is the type (pid/tgid). Hence we created a structure
> > (taskstats) instead of using TLV's.
> 
> Something to remember:
> 
> 1) TLVs are essentially giving you the flexibility to send optionally
> appearing elements. It is up to the receiver (in the kernel or user
> space) to check for the presence of mandatory elements or execute things
> depending on the presence of certain TLVs. Example in your case:
> if the tgid TLV appears then the user is requesting for that TLV
> if the pid appears then they are requesting for that
> if both appear then it is the && of the two.
> You should always ignore TLVs you dont understand - to allow for forward
> compatibility.
> 
> 2)  The "T" part is essentially also encoding (semantically) what size
> the value is; the "L" part is useful for validation. So the receiver
> will always know what the size of the TLV is by definition and uses the
> L to make sure it is the right size. Reject what is of the wrong size.
> 
> cheers,
> jamal

Thanks for the clarification, I will try and adapt our genetlink to use
TLV's, I can see the benefits - we will work on this as an evolutionary change
to our code.
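
The two rules quoted above (skip types you do not understand, reject values of the wrong size) can be sketched as follows. This is a toy layout for illustration only: a 2-byte type, a 2-byte value length, then the value. It is not the netlink attribute wire format, and the type codes are hypothetical.

```python
import struct

# Hypothetical type codes and the value size each one implies.
TYPE_PID, TYPE_TGID = 1, 2
EXPECTED_SIZE = {TYPE_PID: 4, TYPE_TGID: 4}


def parse_tlvs(buf: bytes) -> dict:
    """Parse a toy TLV stream into {type: value}, ignoring unknown types."""
    out = {}
    pos = 0
    while pos + 4 <= len(buf):
        t, length = struct.unpack_from("!HH", buf, pos)
        pos += 4
        value = buf[pos:pos + length]
        if len(value) != length:
            raise ValueError("truncated TLV")
        pos += length
        if t not in EXPECTED_SIZE:
            continue  # unknown type: skip it, for forward compatibility
        if length != EXPECTED_SIZE[t]:
            raise ValueError(f"type {t}: bad length {length}")
        out[t] = struct.unpack("!I", value)[0]
    return out


# One known TLV (pid 1234) followed by one TLV of an unknown type.
msg = struct.pack("!HHI", TYPE_PID, 4, 1234) + struct.pack("!HH3s", 99, 3, b"abc")
print(parse_tlvs(msg))  # {1: 1234}
```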

Warm Regards,
Balbir

^ permalink raw reply

* RE: [PATCH 5/6 v2] IB: IP address based RDMA connection manager
From: Sean Hefty @ 2006-03-13 17:11 UTC (permalink / raw)
  To: 'Roland Dreier'; +Cc: netdev, linux-kernel, openib-general
In-Reply-To: <adabqwafizj.fsf@cisco.com>

> > +static void cma_detach_from_dev(struct rdma_id_private *id_priv)
> > +{
> > +	list_del(&id_priv->list);
> > +	if (atomic_dec_and_test(&id_priv->cma_dev->refcount))
> > +		wake_up(&id_priv->cma_dev->wait);
> > +	id_priv->cma_dev = NULL;
> > +}
>
>doesn't need to do atomic_dec_and_test(), because it is never dropping
>the last reference to id_priv (and in fact if it was, the last line
>would be a use-after-free bug).

It's dropping the reference on cma_dev, as opposed to id_priv.

- Sean

^ permalink raw reply

* RE: Router stops routing after changing MAC Address
From: Greg Scott @ 2006-03-13 17:17 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: linux-kernel, David S. Miller, netdev, Bart Samwel, Alan Cox,
	Simon Mackinlay

HOT DOGGIES!!!!!!!!!!

I think Chuck found the problem.  It turns out that the OUI portion of
the MAC Address - those leftmost 6 hex digits that identify the vendor -
also has some other special meaning built in.  Chuck, I am indebted
to you and the list.  If the second hex digit is odd, this means the
low-order bit of the first octet is set and that means it's a multicast
address.  I think I have my bits right.  Here is an excerpt from
http://www.iana.org/assignments/ethernet-numbers.  

> These addresses are physical station addresses, not multicast nor
> broadcast, so the second hex digit (reading from the left) will be
> even, not odd.

There are also other sources describing how the bits are arranged and
how we display MAC Addresses.  Google is our friend.  

Anyway, one of my fudged MAC Addresses had an odd number in that second
hex digit - and that's why the router did not route.  The solution -
just make sure my fudged MAC Addresses are real unicast MAC Addresses
and not multicast addresses.  

Here is my modified ip-fudge-mac.sh script - note that I also turned
rp_filter back on:

[root@test-fw2 gregs]# more ip-fudge-mac.sh
/sbin/ip link set eth0 down
/sbin/ip link set eth0 address 12:34:56:00:30:50
/sbin/ip link set eth0 up

/sbin/ip link set eth1 down
/sbin/ip link set eth1 address 12:34:56:01:60:03
/sbin/ip link set eth1 up

echo "1" > /proc/sys/net/ipv4/ip_forward
echo "1" > /proc/sys/net/ipv4/conf/eth0/rp_filter
echo "1" > /proc/sys/net/ipv4/conf/eth1/rp_filter

##6: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
##    link/ether 00:10:4b:71:20:60 brd ff:ff:ff:ff:ff:ff
##7: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
##    link/ether 00:60:97:b6:f9:4a brd ff:ff:ff:ff:ff:ff
[root@test-fw2 gregs]# 

I also learned the IEEE has an easy way for anyone to register their own
OUI.  You fill out a web form and pay $1650 and 7 days later, you're the
proud owner of your own OUI block - with 24 bits to use as you see fit.
If $1650 is too steep, you can pay $550 and buy 12 bits of MAC
Addresses.

For now, I decided to use a fudged OUI of 12-34-56 and then use the
rightmost 2 octets of the IP Address with leading zeros to fill out the
rest of the MAC Address.  I will buy some official numbers from the IEEE
later.  
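
A minimal Python sketch of this revised scheme, for illustration. The function name is hypothetical, and the default OUI is simply the fudged 12:34:56 prefix described above.

```python
def mac_from_ip(ip: str, oui: str = "12:34:56") -> str:
    """Build a MAC from a fixed OUI plus the last two IP octets, zero-padded to 3 digits each."""
    last_two = ip.split(".")[-2:]
    digits = "".join(f"{int(octet):03d}" for octet in last_two)
    tail = ":".join(digits[i:i + 2] for i in range(0, 6, 2))
    return f"{oui}:{tail}"


for ip in ("1.2.3.50", "172.16.16.3"):
    mac = mac_from_ip(ip)
    first_octet = int(mac.split(":")[0], 16)
    # Low-order bit clear means a unicast address.
    print(ip, mac, "unicast" if first_octet & 1 == 0 else "multicast")
```

Because the first octet is always 0x12, whose low-order bit is clear, every address built this way is unicast whatever the IP Address is.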

It is proper to give back when given a gift from the community.  So here
is my failover-monitor.sh script in its state right now.  I will
probably do a few more tweaks before going into production.  The .conf
file referenced defines a bunch of IP Addresses and interface names
specific to this site.  

This little script starts up as a daemon at boot time and sends its
output to a log file.  It polls the heartbeat NIC every 10 seconds.  If
the other end does not respond to a ping, it checks all the other NIC
interfaces.  If no response from the other NICs either, it checks the
gateway - the router to the Internet.  If the gateway DOES respond, then
it assumes the primary role.  After assuming the primary role, it polls
the gateway every 10 seconds.  If the gateway goes offline, it takes
itself offline and assumes a backup role - polling every 10 seconds to
determine if it should take control again.  This hopefully minimizes the
probability that both members of the failover pair will try to take
control and that both will assume a backup role with nobody taking
control.    But I may have to tweak the algorithm a bit more after more
testing.  
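
The takeover decision described above boils down to one test per polling cycle. Here is a minimal Python sketch of just that decision, separate from the bash script that follows; the function and parameter names are hypothetical, and each argument stands for the result of one ping.

```python
def should_assume_primary(heartbeat_ok: bool, partner_inet_ok: bool,
                          partner_trusted_ok: bool, gateway_ok: bool) -> bool:
    """Decide whether this node should take the primary role in this polling cycle."""
    if heartbeat_ok:
        return False   # partner answers on the heartbeat NIC
    if partner_inet_ok or partner_trusted_ok:
        return False   # partner still answers on another interface
    return gateway_ok  # take over only if this node can reach the gateway


print(should_assume_primary(False, False, False, True))   # True
print(should_assume_primary(False, False, False, False))  # False
```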

- Greg Scott

#!/bin/bash
# failover-monitor.sh
# First find out if this node or its partner should be primary by 
# checking the flag file.  If the file exists then this node thinks
# it is supposed to be primary, so take control if its partner is 
# unreachable on all interfaces.
# If the flag file does not exist then assume a backup role. 
# Poll its partner.  If its partner is offline then take control.
# If its partner is online then sleep for a few seconds and repeat.  
#
# Greg Scott, March 8, 2006

.  /firewall-scripts/rcfirewall.conf

#
# Figure out who we are
#

if [ $(hostname) = $FW1_HOST ]
then
    ME_HOST=$FW1_HOST
    ME_HBEAT=$FW1_HBEAT
    ME_INET=$FW1_INET
    ME_INETMAC=$FW1_INETMAC
    ME_TRUSTED=$FW1_TRUSTED
    ME_TRUSTEDMAC=$FW1_TRUSTEDMAC
    YOU_HOST=$FW2_HOST
    YOU_HBEAT=$FW2_HBEAT
    YOU_INET=$FW2_INET
    YOU_TRUSTED=$FW2_TRUSTED
else
    ME_HOST=$FW2_HOST
    ME_HBEAT=$FW2_HBEAT
    ME_INET=$FW2_INET
    ME_INETMAC=$FW2_INETMAC
    ME_TRUSTED=$FW2_TRUSTED
    ME_TRUSTEDMAC=$FW2_TRUSTEDMAC
    YOU_HOST=$FW1_HOST
    YOU_HBEAT=$FW1_HBEAT
    YOU_INET=$FW1_INET
    YOU_TRUSTED=$FW1_TRUSTED
fi

function take_control {
# This function is called when the failover partner does not reply
# on the YOU_HBEAT IP Address.  
# 
# Take over the firewall IP address and special MAC address iff:
# This node, "ME", can see the Internet gateway and YOU_TRUSTED
# and INET_IP do not answer.  Remember that INET_IP is the
# IP Address of the primary firewall.  That is why we test 
# for INET_IP and not YOU_INET.  

    echo "Investigating taking control"

    #
    # Ping our partner's other interfaces and the gateway and check
    # the status codes. Status of 0 is success.  1 is no reply, 2 is
    # any other error.  See ping man pages.
    #

    echo "Checking to see if $YOU_HOST answers on its other interfaces"

    ST=0

    ping -c 1 -q -w 3 $INET_IP &> /dev/null
    ST=$?
    # Ping INET_IP instead of YOU_INET because INET_IP is the
    # primary IP Address.
    if [ $ST = 0 ] 
    then 
	echo "$YOU_HOST is alive on $INET_IP.  Not assuming primary role."
	ST=$YOU_PARTONLINE
    else
	echo "$YOU_HOST does not answer on $INET_IP"
	ping -c 1 -q -w 3 $YOU_TRUSTED &> /dev/null
	ST=$?
	if [ $ST = 0 ]
	then
	    echo "$YOU_HOST is alive on $YOU_TRUSTED.  Not assuming primary role."
	    ST=$YOU_PARTONLINE
	else
	    echo "$YOU_HOST does not answer on $YOU_TRUSTED"
	    ping -c 1 -q -w 3 $GATEWAY_IP &> /dev/null
	    ST=$?
	    if [ $ST != 0 ]
	    then
		echo "Gateway at $GATEWAY_IP does not answer.  Not assuming primary role."
	    else
		echo "I see gateway $GATEWAY_IP."
		echo "$(date) $ME_HOST Assuming primary firewall role"
		assume_primary
		echo "$(date) $ME_HOST relinquished primary firewall role."
	    fi
	fi
    fi

    return $ST
}

function assume_primary {
# Create FLAGFILE noting that this node is primary.
# Set up the IP Addresses on the INET and TRUSTED1 interfaces.
# run rc.firewall.
# Poll GATEWAY_IP periodically.  
# If it does not answer
# then reset all interfaces and firewall rules back to their 
# initial state and return.

echo "$(date) $ME_HOST assuming primary firewall role." >> $FLAGFILE

/sbin/ifdown $INET_IFACE
/sbin/ifconfig $INET_IFACE hw ether $INET_MAC
/sbin/ifconfig $INET_IFACE $INET_IP netmask $INET_NETMASK broadcast $INET_BCAST_ADDRESS
/sbin/ifup $INET_IFACE

/sbin/ifdown $TRUSTED1_IFACE
/sbin/ifconfig $TRUSTED1_IFACE hw ether $TRUSTED1_MAC
/sbin/ifconfig $TRUSTED1_IFACE $TRUSTED1_IP netmask $TRUSTED1_NETMASK broadcast $TRUSTED1_BCAST_ADDRESS
/sbin/ifup $TRUSTED1_IFACE

echo "Running rc.firewall"

/firewall-scripts/rc.firewall

#
# So now this node is primary and handling all firewall duties.  Poll the
# gateway every 10 seconds and resume a backup role if this node and the
# gateway lose touch with each other.  This is a safety mechanism to reduce
# the odds that both nodes will try to become primary at the same time.
#

while true ; do

    # echo "$(date) sleeping 10 seconds"
    sleep 10

    # Ping the gateway and check the status code
    ping -c 1 -q -w 3 $GATEWAY_IP &> /dev/null

    if [ $? != 0 ]
    then
	# We lost contact with the gateway so reset everything
	echo ""
	echo "$(date) The gateway at $GATEWAY_IP appears to be offline."
	# DO NOT remove_flagfile
	# because if the gateway comes back somebody has to take control.
	reset_interfaces
	break
    fi

done

return 0
}

function reset_interfaces {

echo "Resetting $INET_IFACE to $ME_INET with MAC $ME_INETMAC"
/sbin/ifdown $INET_IFACE
/sbin/ifconfig $INET_IFACE hw ether $ME_INETMAC
/sbin/ifconfig $INET_IFACE $ME_INET netmask $INET_NETMASK broadcast $INET_BCAST_ADDRESS
/sbin/ifup $INET_IFACE

echo "Resetting $TRUSTED1_IFACE to $ME_TRUSTED with MAC $ME_TRUSTEDMAC"
/sbin/ifdown $TRUSTED1_IFACE
/sbin/ifconfig $TRUSTED1_IFACE hw ether $ME_TRUSTEDMAC
/sbin/ifconfig $TRUSTED1_IFACE $ME_TRUSTED netmask $TRUSTED1_NETMASK broadcast $TRUSTED1_BCAST_ADDRESS
/sbin/ifup $TRUSTED1_IFACE

echo "Resetting to initial firewall rules."
/firewall-scripts/initial_rc.firewall

return 0
}

function remove_flagfile {
echo "$(date) Removing ${FLAGFILE}"
rm -f $FLAGFILE
return 0
}

echo "$(date) starting up failover.sh on $ME_HOST"

echo "Me"
echo "ME_HOST is $ME_HOST"
echo "ME_HBEAT is $ME_HBEAT"
echo "ME_INET is $ME_INET"
echo "ME_TRUSTED is $ME_TRUSTED"

echo
echo "You"
echo "YOU_HOST is $YOU_HOST"
echo "YOU_HBEAT is $YOU_HBEAT"
echo "YOU_INET is $YOU_INET"
echo "YOU_TRUSTED is $YOU_TRUSTED"

echo

reset_interfaces

echo "Initialization complete.  Starting loop"

#
# Initialization is now complete
#

HBEAT_FLG=0

while true ; do

    # echo "$(date) sleeping 10 seconds"
    sleep 10

    if [ -f $FLAGFILE ]
    then
	echo "$FLAGFILE found; attempting to seize control regardless of heartbeat"
	take_control
	if [ $? != 0 ]
	then
	    echo "Unable to take control; removing $FLAGFILE"
	    remove_flagfile
	fi
    fi

    #
    # Check for heartbeat
    #
    ping -c 1 -q -w 3 $YOU_HBEAT &> /dev/null
    if [ $? != 0 ]
    then
	HBEAT_FLG=1
	echo "$(date) No heartbeat detected from $YOU_HOST"
	take_control
	continue
    else
	if [ $HBEAT_FLG != 0 ]
	then
	    HBEAT_FLG=0
	    echo "$(date) Heartbeat with $YOU_HOST restored"
	fi
    fi

done

exit 0

-----Original Message-----
From: Chuck Ebbert [mailto:76306.1226@compuserve.com]
Sent: Monday, March 13, 2006 12:11 AM
To: Greg Scott
Cc: linux-kernel; David S. Miller
Subject: Re: Router stops routing after changing MAC Address

In-Reply-To:
<925A849792280C4E80C5461017A4B8A20321CC@mail733.InfraSupportEtc.com>

On Fri, 10 Mar 2006 18:33:15 -0600, Greg Scott wrote:

> How to change MAC addresses is documented well enough - and it works -
> but when I change MAC addresses, my router stops routing.  From the 
> router, I can see the systems on both sides - but the router just 
> refuses to forward packets.  Here are my little test scripts to change
> MAC Addresses.
> 
> First - ip-fudge-mac.sh
> [root@test-fw2 gregs]# more ip-fudge-mac.sh
> ip link set eth0 down
> ip link set eth0 address 01:02:03:04:05:06
                            ^
 Bit zero is set, so this is a multicast address.  Is that intentional?

> ip link set eth0 up
> 
> ip link set eth1 down
> ip link set eth1 address 17:20:16:01:60:03
                            ^
 Ditto.

> ip link set eth1 up
> 
> echo "1" > /proc/sys/net/ipv4/ip_forward

--
Chuck
"Penguins don't come from next door, they come from the Antarctic!"

^ permalink raw reply

* Re: [PATCH 5/6 v2] IB: IP address based RDMA connection manager
From: Roland Dreier @ 2006-03-13 17:26 UTC (permalink / raw)
  To: Sean Hefty; +Cc: netdev, linux-kernel, openib-general
In-Reply-To: <ORSMSX401FRaqbC8wSA0000001e@orsmsx401.amr.corp.intel.com>

    Sean> It's dropping the reference on cma_dev, as opposed to
    Sean> id_priv.

Duh, sorry.

 - R.

^ permalink raw reply
