Netdev List
 help / color / mirror / Atom feed
* Re: TSO and IPoIB performance degradation
From: Michael S. Tsirkin @ 2006-03-20 13:35 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: netdev, rdreier, rick.jones2, linux-kernel, openib-general,
	David S. Miller, Lennert Buytenhek
In-Reply-To: <1142855615.3114.33.camel@laptopd505.fenrus.org>

Quoting Arjan van de Ven <arjan@infradead.org>:
> > I read it as if he was proposing to have a sysctl knob to turn off
> > TCP congestion control completely (which has so many issues it's not
> > even funny.)
> 
> owww that's so bad I didn't even consider that

No, I think that comment was taken out of thread context. We were talking about
stretching ACKs - while avoiding stretch ACKs is important for TCP congestion
control, it's not the only mechanism.

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies

^ permalink raw reply

* Re: TSO and IPoIB performance degradation
From: Benjamin LaHaise @ 2006-03-20 15:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, rdreier, rick.jones2, linux-kernel, openib-general,
	David S. Miller, Lennert Buytenhek, Arjan van de Ven
In-Reply-To: <20060320120407.GY29929@mellanox.co.il>

On Mon, Mar 20, 2006 at 02:04:07PM +0200, Michael S. Tsirkin wrote:
> does not stretch ACKs anymore. RFC 2581 does mention that it might be OK to
> stretch ACKs "after careful consideration", and we are seeing that it helps
> IP over InfiniBand, so recent Linux kernels perform worse in that respect.
> 
> And since there does not seem to be a way to figure it out automagically when
> doing this is a good idea, I proposed adding some kind of knob that will let the
> user apply the consideration for us.

Wouldn't it make sense to strech the ACK when the previous ACK is still in 
the TX queue of the device?  I know that sort of behaviour was always an 
issue on modem links where you don't want to send out redundant ACKs.

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <dont@kvack.org>.

^ permalink raw reply

* Re: TSO and IPoIB performance degradation
From: Rick Jones @ 2006-03-20 18:58 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: netdev, rdreier, linux-kernel, openib-general, David S. Miller,
	Lennert Buytenhek, Arjan van de Ven
In-Reply-To: <20060320150941.GD16108@kvack.org>

> Wouldn't it make sense to strech the ACK when the previous ACK is still in 
> the TX queue of the device?  I know that sort of behaviour was always an 
> issue on modem links where you don't want to send out redundant ACKs.

Perhaps, but it isn't clear that it would be worth the cycles to check. 
    I doubt that a simple reference count on the ACK skb would do it 
since if it were a bare ACK I doubt that TCP keeps a reference to the 
skb in the first place?

Also, what would be the "trigger" to send the next ACK after the 
previous one had left the building (Elvis-like)?  Receipt of N in-order 
segments?  A timeout?

If you are going to go ahead and try to do stretch-ACKs, then I suspect 
the way to go about doing it is to have it behave very much like HP-UX 
or Solaris, both of which have arguably reasonable ACK-avoidance 
heuristics in them.

But don't try to do it quick and dirty.

rick "likes ACK avoidance, just check the archives" jones
on netdev, no need to cc me directly

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: Chris Wright @ 2006-03-20 20:18 UTC (permalink / raw)
  To: Ingo Oeser
  Cc: Chris Wright, Ingo Oeser, davem, linux-kernel, netdev,
	Andrew Morton
In-Reply-To: <200603201244.58507.netdev@axxeo.de>

* Ingo Oeser (netdev@axxeo.de) wrote:
> Hi Chris,
> 
> Andrew Morton wrote:
> > Ingo Oeser <ioe-lkml@rameria.de> wrote:
> > >
> > >  -int scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm)
> > >  -{
> > >  -	struct task_struct *p = current;
> > >  -	scm->creds = (struct ucred) {
> > >  -		.uid = p->uid,
> > >  -		.gid = p->gid,
> > >  -		.pid = p->tgid
> > >  -	};
> > >  -	scm->fp = NULL;
> > >  -	scm->sid = security_sk_sid(sock->sk, NULL, 0);
> > >  -	scm->seq = 0;
> > >  -	if (msg->msg_controllen <= 0)
> > >  -		return 0;
> > >  -	return __scm_send(sock, msg, scm);
> > >  -}
> > 
> > It's worth noting that scm_send() will call security_sk_sid() even if
> > (msg->msg_controllen <= 0).
> 
> Chris, do you know if this is needed in this case?

This whole thing is looking broken.  I'm still trying to find the original
patch which caused the series of broken patches on top.

thanks,
-chris

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: Chris Wright @ 2006-03-20 21:36 UTC (permalink / raw)
  To: Catherine Zhang
  Cc: Ingo Oeser, Chris Wright, Ingo Oeser, davem, linux-kernel, netdev,
	Andrew Morton
In-Reply-To: <20060320201802.GS15997@sorel.sous-sol.org>

* Chris Wright (chrisw@sous-sol.org) wrote:
> * Ingo Oeser (netdev@axxeo.de) wrote:
> > Hi Chris,
> > 
> > Andrew Morton wrote:
> > > Ingo Oeser <ioe-lkml@rameria.de> wrote:
> > > >
> > > >  -int scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm)
> > > >  -{
> > > >  -	struct task_struct *p = current;
> > > >  -	scm->creds = (struct ucred) {
> > > >  -		.uid = p->uid,
> > > >  -		.gid = p->gid,
> > > >  -		.pid = p->tgid
> > > >  -	};
> > > >  -	scm->fp = NULL;
> > > >  -	scm->sid = security_sk_sid(sock->sk, NULL, 0);
> > > >  -	scm->seq = 0;
> > > >  -	if (msg->msg_controllen <= 0)
> > > >  -		return 0;
> > > >  -	return __scm_send(sock, msg, scm);
> > > >  -}
> > > 
> > > It's worth noting that scm_send() will call security_sk_sid() even if
> > > (msg->msg_controllen <= 0).
> > 
> > Chris, do you know if this is needed in this case?
> 
> This whole thing is looking broken.  I'm still trying to find the original
> patch which caused the series of broken patches on top.

OK, it starts here from Catherine's patch:

include/net/scm.h::scm_recv()
+	if (test_bit(SOCK_PASSSEC, &sock->flags)) {
+		err = security_sid_to_context(scm->sid, &scontext, &scontext_len);
+		if (!err)
+ 			put_cmsg(msg, SOL_SOCKET, SCM_SECURITY, scontext_len, scontext);
+ }

Catherine, the security_sid_to_context() is a raw SELinux function which
crept into core code and should not have been there.  The fallout fixes
included conditionally exporting security_sid_to_context, and finally
scm_send/recv unlining.  The end result in -mm looks broken to me.
Specifically, it now does:

	ucred->uid = tsk->uid;
	ucred->gid = tsk->gid;
	ucred->pid = tsk->tgid;
	scm->fp = NULL;
	scm->seq = 0;
	if (msg->msg_controllen <= 0)
		return 0;

	scm->sid = security_sk_sid(sock->sk, NULL, 0);

The point of Catherine's original patch was to make sure there's always
a security identifier associated with AF_UNIX messages.  So receiver
can always check it (same as having credentials even w/out sender
control message passing them).  Now we will have garbage for sid.

thanks,
-chris

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: Andrew Morton @ 2006-03-20 22:31 UTC (permalink / raw)
  To: Chris Wright
  Cc: cxzhang, netdev, chrisw, ioe-lkml, davem, linux-kernel, netdev
In-Reply-To: <20060320213636.GT15997@sorel.sous-sol.org>

Chris Wright <chrisw@sous-sol.org> wrote:
>
> * Chris Wright (chrisw@sous-sol.org) wrote:
> > * Ingo Oeser (netdev@axxeo.de) wrote:
> > > Hi Chris,
> > > 
> > > Andrew Morton wrote:
> > > > Ingo Oeser <ioe-lkml@rameria.de> wrote:
> > > > >
> > > > >  -int scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm)
> > > > >  -{
> > > > >  -	struct task_struct *p = current;
> > > > >  -	scm->creds = (struct ucred) {
> > > > >  -		.uid = p->uid,
> > > > >  -		.gid = p->gid,
> > > > >  -		.pid = p->tgid
> > > > >  -	};
> > > > >  -	scm->fp = NULL;
> > > > >  -	scm->sid = security_sk_sid(sock->sk, NULL, 0);
> > > > >  -	scm->seq = 0;
> > > > >  -	if (msg->msg_controllen <= 0)
> > > > >  -		return 0;
> > > > >  -	return __scm_send(sock, msg, scm);
> > > > >  -}
> > > > 
> > > > It's worth noting that scm_send() will call security_sk_sid() even if
> > > > (msg->msg_controllen <= 0).
> > > 
> > > Chris, do you know if this is needed in this case?
> > 
> > This whole thing is looking broken.  I'm still trying to find the original
> > patch which caused the series of broken patches on top.
> 
> OK, it starts here from Catherine's patch:
> 
> include/net/scm.h::scm_recv()
> +	if (test_bit(SOCK_PASSSEC, &sock->flags)) {
> +		err = security_sid_to_context(scm->sid, &scontext, &scontext_len);
> +		if (!err)
> + 			put_cmsg(msg, SOL_SOCKET, SCM_SECURITY, scontext_len, scontext);
> + }
> 
> Catherine, the security_sid_to_context() is a raw SELinux function which
> crept into core code and should not have been there.  The fallout fixes
> included conditionally exporting security_sid_to_context, and finally
> scm_send/recv unlining.

Yes.  So we're OK up the uninlining, right?

>  The end result in -mm looks broken to me.
> Specifically, it now does:
> 
> 	ucred->uid = tsk->uid;
> 	ucred->gid = tsk->gid;
> 	ucred->pid = tsk->tgid;
> 	scm->fp = NULL;
> 	scm->seq = 0;
> 	if (msg->msg_controllen <= 0)
> 		return 0;
> 
> 	scm->sid = security_sk_sid(sock->sk, NULL, 0);
> 
> The point of Catherine's original patch was to make sure there's always
> a security identifier associated with AF_UNIX messages.  So receiver
> can always check it (same as having credentials even w/out sender
> control message passing them).  Now we will have garbage for sid.

This answers the question I've been asking all and sundry for a week, thanks ;)

So:

- scm-fold-__scm_send-into-scm_send.patch is OK

- scm_send-speedup.patch is wrong

- Catherine's patch introduces a possibly-significant performance
  problem: we're now calling the expensive-on-SELinux security_sk_sid()
  more frequently than we used to.

- That "initialise scm->creds via a temporary struct" trick still
  generates bad code.


I actually have enough to be going on with here - I'll drop it all.

^ permalink raw reply

* Re: TSO and IPoIB performance degradation
From: David S. Miller @ 2006-03-20 23:00 UTC (permalink / raw)
  To: bcrl
  Cc: netdev, rdreier, rick.jones2, linux-kernel, openib-general,
	buytenh, arjan
In-Reply-To: <20060320150941.GD16108@kvack.org>

From: Benjamin LaHaise <bcrl@kvack.org>
Date: Mon, 20 Mar 2006 10:09:42 -0500

> Wouldn't it make sense to strech the ACK when the previous ACK is still in 
> the TX queue of the device?  I know that sort of behaviour was always an 
> issue on modem links where you don't want to send out redundant ACKs.

I thought about doing some similar trick with TSO, wherein we would
not defer a TSO send if all the previous packets sent are out of the
device transmit queue.  The idea was the prevent the pipe from ever
emptying which is the danger of deferring too much for TSO.

This has several problems.  It's hard to implement.  You have to
decide if you want precise state, thereby checking the TX descriptors.
Or you go for imprecise but easier to implement, which is very
imprecise and therefore not very useful state, by just checking the
SKB refcount or similar which means that you find out it's left the TX
queue after the TX purge interrupt which can be a long time after the
event and by then the pipe has empties which is what you were trying
to prevent.

Lastly, you don't want to touch remote cpu state which is what such
a hack is going to end up doing much of the time.

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: Chris Wright @ 2006-03-20 23:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Chris Wright, cxzhang, netdev, ioe-lkml, davem, linux-kernel,
	netdev
In-Reply-To: <20060320143103.31b7d933.akpm@osdl.org>

* Andrew Morton (akpm@osdl.org) wrote:
> Chris Wright <chrisw@sous-sol.org> wrote:
> > Catherine, the security_sid_to_context() is a raw SELinux function which
> > crept into core code and should not have been there.  The fallout fixes
> > included conditionally exporting security_sid_to_context, and finally
> > scm_send/recv unlining.
> 
> Yes.  So we're OK up the uninlining, right?

Yes, although sid_to_context is meant to be analog to the other
get_peersec calls, and should really be made a proper part of the
interface (can be done later, correctness is the issue at hand).

> >  The end result in -mm looks broken to me.
> > Specifically, it now does:
> > 
> > 	ucred->uid = tsk->uid;
> > 	ucred->gid = tsk->gid;
> > 	ucred->pid = tsk->tgid;
> > 	scm->fp = NULL;
> > 	scm->seq = 0;
> > 	if (msg->msg_controllen <= 0)
> > 		return 0;
> > 
> > 	scm->sid = security_sk_sid(sock->sk, NULL, 0);
> > 
> > The point of Catherine's original patch was to make sure there's always
> > a security identifier associated with AF_UNIX messages.  So receiver
> > can always check it (same as having credentials even w/out sender
> > control message passing them).  Now we will have garbage for sid.
> 
> This answers the question I've been asking all and sundry for a week, thanks ;)
> So:
> 
> - scm-fold-__scm_send-into-scm_send.patch is OK

Yes.

> - scm_send-speedup.patch is wrong

Yes.

> - Catherine's patch introduces a possibly-significant performance
>   problem: we're now calling the expensive-on-SELinux security_sk_sid()
>   more frequently than we used to.

I don't expect security_sk_sid() to be terribly expensive.  It's not
an AVC check, it's just propagating a label.  But I've not done any
benchmarking on that.

thanks,
-chris

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: David S. Miller @ 2006-03-20 23:28 UTC (permalink / raw)
  To: chrisw; +Cc: cxzhang, netdev, ioe-lkml, linux-kernel, netdev, akpm
In-Reply-To: <20060320213636.GT15997@sorel.sous-sol.org>

From: Chris Wright <chrisw@sous-sol.org>
Date: Mon, 20 Mar 2006 13:36:36 -0800

> The point of Catherine's original patch was to make sure there's always
> a security identifier associated with AF_UNIX messages.  So receiver
> can always check it (same as having credentials even w/out sender
> control message passing them).  Now we will have garbage for sid.

I'm seriously considering backing out Catherine's AF_UNIX patch from
the net-2.6.17 tree before submitting it to Linus later today so that
none of this crap goes in right now.

It appears that this needs a lot more sorting out, so for now that's
probably the right thing to do.

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: Chris Wright @ 2006-03-20 23:43 UTC (permalink / raw)
  To: David S. Miller
  Cc: chrisw, cxzhang, netdev, ioe-lkml, linux-kernel, netdev, akpm
In-Reply-To: <20060320.152838.68858441.davem@davemloft.net>

* David S. Miller (davem@davemloft.net) wrote:
> From: Chris Wright <chrisw@sous-sol.org>
> Date: Mon, 20 Mar 2006 13:36:36 -0800
> 
> > The point of Catherine's original patch was to make sure there's always
> > a security identifier associated with AF_UNIX messages.  So receiver
> > can always check it (same as having credentials even w/out sender
> > control message passing them).  Now we will have garbage for sid.
> 
> I'm seriously considering backing out Catherine's AF_UNIX patch from
> the net-2.6.17 tree before submitting it to Linus later today so that
> none of this crap goes in right now.
> 
> It appears that this needs a lot more sorting out, so for now that's
> probably the right thing to do.

I won't object.  I checked your tree, it looks OK to me.  The actual
broken patch appears in -mm, and the security_sid_to_context snafu
is primarily cosmetic at this point (the exports, etc fixed the real
compilation issues AFAICT).  But, again, if you want to drop that's fine
w/ me.  I'm sure Catherine can cleanup and resend as needed.

thanks,
-chris

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: James Morris @ 2006-03-21  0:37 UTC (permalink / raw)
  To: David S. Miller
  Cc: chrisw, cxzhang, netdev, ioe-lkml, linux-kernel, netdev, akpm
In-Reply-To: <20060320.152838.68858441.davem@davemloft.net>

On Mon, 20 Mar 2006, David S. Miller wrote:

> I'm seriously considering backing out Catherine's AF_UNIX patch from
> the net-2.6.17 tree before submitting it to Linus later today so that
> none of this crap goes in right now.

I believe Catherine is away this week, so it's probably best to drop the 
code and wait till she gets back and we can get it 100% right.

Sorry, this is my fault, I should have caught this problem.



- James
-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: David S. Miller @ 2006-03-21  0:50 UTC (permalink / raw)
  To: jmorris; +Cc: chrisw, cxzhang, netdev, ioe-lkml, linux-kernel, netdev, akpm
In-Reply-To: <Pine.LNX.4.64.0603201936030.6749@excalibur.intercode>

From: James Morris <jmorris@namei.org>
Date: Mon, 20 Mar 2006 19:37:51 -0500 (EST)

> I believe Catherine is away this week, so it's probably best to drop the 
> code and wait till she gets back and we can get it 100% right.

Ok, agreed.

> Sorry, this is my fault, I should have caught this problem.

No worries.

^ permalink raw reply

* [PATCH] au1000_tx_timeout and promiscuous mode
From: elmar gerdes @ 2006-03-21  0:50 UTC (permalink / raw)
  To: netdev; +Cc: linux-mips

[-- Attachment #1: Type: TEXT/PLAIN, Size: 622 bytes --]


hello,

the attached patch fixes a problem I had when running 2 different
Au15xx-based boards (Au1500 and Au1550) in bridging mode under high load.
au1000_tx_timeout() would reset the device for which it was called, but
promiscuous mode was not re-established.

tested under 2.6.14, 2.6.16-rc6 and 2.4.31.

on the 2.6.1x systems, the timeout appears after some 5 seconds, under
2.4.31 I thought it did not appear, but after half an hour of full load
I've seen it there, too.  now this makes me wonder why it takes only a
few seconds to have a tx timeout under 2.6 and half an hour under 2.4.
any ideas?

regards,
 	elmar

[-- Attachment #2: Type: TEXT/PLAIN, Size: 10214 bytes --]

1) au1000_tx_timeout(): Re-establish promiscuous mode (when bridging)
   after the re-init.

2) Do not dereference device private structure but use netdev_priv()
   instead (according to Rubini et.al. "Linux Device Drivers", 3rd
   edition, o-Reilly.

Apply to kernel trees 2.6.x and 2.4.x.

Tested on
 - myCable XXS1500, Rev.B, Au1500-based, and
 - Kurz NVB, Au1550-based, not in kernel-tree.

Signed-off-by: elmar gerdes <elmar.gerdes@engel-kg.com>


--- a/drivers/net/au1000_eth.c
+++ b/drivers/net/au1000_eth.c
@@ -206,7 +206,7 @@
 		printk(KERN_ERR "bcm_5201_status error: NULL dev\n");
 		return -1;
 	}
-	aup = (struct au1000_private *) dev->priv;
+	aup = netdev_priv(dev);
 
 	mii_data = mdio_read(dev, aup->phy_addr, MII_STATUS);
 	if (mii_data & MII_STAT_LINK) {
@@ -293,7 +293,7 @@
 		printk(KERN_ERR "lsi_80227_status error: NULL dev\n");
 		return -1;
 	}
-	aup = (struct au1000_private *) dev->priv;
+	aup = netdev_priv(dev);
 
 	mii_data = mdio_read(dev, aup->phy_addr, MII_STATUS);
 	if (mii_data & MII_STAT_LINK) {
@@ -404,7 +404,7 @@
 		return -1;
 	}
 
-	aup = (struct au1000_private *) dev->priv;
+	aup = netdev_priv(dev);
 	mii_data = mdio_read(dev, aup->phy_addr, MII_STATUS);
 
 	if (mii_data & MII_STAT_LINK) {
@@ -486,7 +486,7 @@
 		printk(KERN_ERR "lxt971a_status error: NULL dev\n");
 		return -1;
 	}
-	aup = (struct au1000_private *) dev->priv;
+	aup = netdev_priv(dev);
 
 	mii_data = mdio_read(dev, aup->phy_addr, MII_STATUS);
 	if (mii_data & MII_STAT_LINK) {
@@ -571,7 +571,7 @@
 		printk(KERN_ERR "ks8995m_status error: NULL dev\n");
 		return -1;
 	}
-	aup = (struct au1000_private *) dev->priv;
+	aup = netdev_priv(dev);
 
 	mii_data = mdio_read(dev, aup->phy_addr, MII_STATUS);
 	if (mii_data & MII_STAT_LINK) {
@@ -664,7 +664,7 @@
 		return -1;
 	}
 
-	aup = (struct au1000_private *) dev->priv;
+	aup = netdev_priv(dev);
 	mii_data = mdio_read(dev, aup->phy_addr, MII_STATUS);
 
 	if (mii_data & MII_STAT_LINK) {
@@ -794,7 +794,7 @@
 
 static int mdio_read(struct net_device *dev, int phy_id, int reg)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	volatile u32 *mii_control_reg;
 	volatile u32 *mii_data_reg;
 	u32 timedout = 20;
@@ -854,7 +854,7 @@
 
 static void mdio_write(struct net_device *dev, int phy_id, int reg, u16 value)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	volatile u32 *mii_control_reg;
 	volatile u32 *mii_data_reg;
 	u32 timedout = 20;
@@ -911,7 +911,7 @@
 
 static int mii_probe (struct net_device * dev)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	int phy_addr;
 #ifdef CONFIG_MIPS_BOSPORUS
 	int phy_found=0;
@@ -1078,7 +1078,7 @@
 
 static void enable_rx_tx(struct net_device *dev)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 
 	if (au1000_debug > 4)
 		printk(KERN_INFO "%s: enable_rx_tx\n", dev->name);
@@ -1089,7 +1089,7 @@
 
 static void hard_stop(struct net_device *dev)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 
 	if (au1000_debug > 4)
 		printk(KERN_INFO "%s: hard stop\n", dev->name);
@@ -1103,7 +1103,7 @@
 {
 	int i;
 	u32 flags;
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 
 	if (au1000_debug > 4) 
 		printk(KERN_INFO "%s: reset mac, aup %x\n", 
@@ -1240,7 +1240,7 @@
 
 static int au1000_setup_aneg(struct net_device *dev, u32 advertise)
 {
-	struct au1000_private *aup = (struct au1000_private *)dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	u16 ctl, adv;
 
 	/* Setup standard advertise */
@@ -1266,7 +1266,7 @@
 
 static int au1000_setup_forced(struct net_device *dev, int speed, int fd)
 {
-	struct au1000_private *aup = (struct au1000_private *)dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	u16 ctl;
 
 	ctl = mdio_read(dev, aup->phy_addr, MII_BMCR);
@@ -1297,7 +1297,7 @@
 static void
 au1000_start_link(struct net_device *dev, struct ethtool_cmd *cmd)
 {
-	struct au1000_private *aup = (struct au1000_private *)dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	u32 advertise;
 	int autoneg;
 	int forced_speed;
@@ -1333,7 +1333,7 @@
 
 static int au1000_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 {
-	struct au1000_private *aup = (struct au1000_private *)dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	u16 link, speed;
 
 	cmd->supported = GENMII_DEFAULT_FEATURES;
@@ -1358,7 +1358,7 @@
 
 static int au1000_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 {
-	 struct au1000_private *aup = (struct au1000_private *)dev->priv;
+	 struct au1000_private *aup = netdev_priv(dev);
 	  unsigned long features = GENMII_DEFAULT_FEATURES;
 
 	 if (!capable(CAP_NET_ADMIN))
@@ -1402,7 +1402,7 @@
 
 static int au1000_nway_reset(struct net_device *dev)
 {
-	struct au1000_private *aup = (struct au1000_private *)dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 
 	if (!aup->want_autoneg)
 		return -EINVAL;
@@ -1415,7 +1415,7 @@
 static void
 au1000_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
 {
-	struct au1000_private *aup = (struct au1000_private *)dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 
 	strcpy(info->driver, DRV_NAME);
 	strcpy(info->version, DRV_VERSION);
@@ -1470,7 +1470,7 @@
 	printk("%s: Au1x Ethernet found at 0x%x, irq %d\n", 
 			dev->name, ioaddr, irq);
 
-	aup = dev->priv;
+	aup = netdev_priv(dev);
 
 	/* Allocate the data buffers */
 	/* Snooping works fine with eth on all au1xxx */
@@ -1637,7 +1637,7 @@
  */
 static int au1000_init(struct net_device *dev)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	u32 flags;
 	int i;
 	u32 control;
@@ -1689,7 +1689,7 @@
 static void au1000_timer(unsigned long data)
 {
 	struct net_device *dev = (struct net_device *)data;
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	unsigned char if_port;
 	u16 link, speed;
 
@@ -1742,7 +1742,7 @@
 static int au1000_open(struct net_device *dev)
 {
 	int retval;
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 
 	if (au1000_debug > 4)
 		printk("%s: open: dev=%p\n", dev->name, dev);
@@ -1776,7 +1776,7 @@
 static int au1000_close(struct net_device *dev)
 {
 	u32 flags;
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 
 	if (au1000_debug > 4)
 		printk("%s: close: dev=%p\n", dev->name, dev);
@@ -1804,7 +1804,7 @@
 	for (i = 0; i < num_ifs; i++) {
 		dev = iflist[i].dev;
 		if (dev) {
-			aup = (struct au1000_private *) dev->priv;
+			aup = netdev_priv(dev);
 			unregister_netdev(dev);
 			kfree(aup->mii);
 			for (j = 0; j < NUM_RX_DMA; j++) {
@@ -1829,7 +1829,7 @@
 static inline void 
 update_tx_stats(struct net_device *dev, u32 status, u32 pkt_len)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	struct net_device_stats *ps = &aup->stats;
 
 	ps->tx_packets++;
@@ -1861,7 +1861,7 @@
  */
 static void au1000_tx_ack(struct net_device *dev)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	volatile tx_dma_t *ptxd;
 
 	ptxd = aup->tx_dma_ring[aup->tx_tail];
@@ -1888,7 +1888,7 @@
  */
 static int au1000_tx(struct sk_buff *skb, struct net_device *dev)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	volatile tx_dma_t *ptxd;
 	u32 buff_stat;
 	db_dest_t *pDB;
@@ -1939,7 +1939,7 @@
 
 static inline void update_rx_stats(struct net_device *dev, u32 status)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	struct net_device_stats *ps = &aup->stats;
 
 	ps->rx_packets++;
@@ -1967,7 +1967,7 @@
  */
 static int au1000_rx(struct net_device *dev)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	struct sk_buff *skb;
 	volatile rx_dma_t *prxd;
 	u32 buff_stat, status;
@@ -2070,6 +2070,8 @@
 	printk(KERN_ERR "%s: au1000_tx_timeout: dev=%p\n", dev->name, dev);
 	reset_mac(dev);
 	au1000_init(dev);
+	/* Set promiscuous mode */
+	set_rx_mode(dev);
 	dev->trans_start = jiffies;
 	netif_wake_queue(dev);
 }
@@ -2093,7 +2095,7 @@
 
 static void set_rx_mode(struct net_device *dev)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 
 	if (au1000_debug > 4) 
 		printk("%s: set_rx_mode: flags=%x\n", dev->name, dev->flags);
@@ -2127,7 +2129,7 @@
 
 static int au1000_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 {
-	struct au1000_private *aup = (struct au1000_private *)dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	u16 *data = (u16 *)&rq->ifr_ifru;
 
 	switch(cmd) { 
@@ -2154,7 +2156,7 @@
 
 static int au1000_set_config(struct net_device *dev, struct ifmap *map)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 	u16 control;
 
 	if (au1000_debug > 4)  {
@@ -2248,7 +2250,7 @@
 
 static struct net_device_stats *au1000_get_stats(struct net_device *dev)
 {
-	struct au1000_private *aup = (struct au1000_private *) dev->priv;
+	struct au1000_private *aup = netdev_priv(dev);
 
 	if (au1000_debug > 4)
 		printk("%s: au1000_get_stats: dev=%p\n", dev->name, dev);

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: Stephen Smalley @ 2006-03-21 13:32 UTC (permalink / raw)
  To: Chris Wright
  Cc: James Morris, Andrew Morton, cxzhang, netdev, ioe-lkml, davem,
	linux-kernel, netdev
In-Reply-To: <20060320231508.GV15997@sorel.sous-sol.org>

On Mon, 2006-03-20 at 15:15 -0800, Chris Wright wrote:
> * Andrew Morton (akpm@osdl.org) wrote:
> > Chris Wright <chrisw@sous-sol.org> wrote:
> > > Catherine, the security_sid_to_context() is a raw SELinux function which
> > > crept into core code and should not have been there.  The fallout fixes
> > > included conditionally exporting security_sid_to_context, and finally
> > > scm_send/recv unlining.
> > 
> > Yes.  So we're OK up the uninlining, right?
> 
> Yes, although sid_to_context is meant to be analog to the other
> get_peersec calls, and should really be made a proper part of the
> interface (can be done later, correctness is the issue at hand).

Yes, Catherine was told that she shouldn't be directly exporting
security_sid_to_context, and was allegedly working on a fix.  Note
however that the expected solution is not a LSM interface but a set of
properly encapsulated interfaces exported directly from SELinux, based
on the iptables context matching patches by James.  The same style of
interface is being put forth for the audit LSPP work.  The indirection
of LSM serves no purpose here, as these users are specifically looking
for functionality provided only by SELinux.

> I don't expect security_sk_sid() to be terribly expensive.  It's not
> an AVC check, it's just propagating a label.  But I've not done any
> benchmarking on that.

No permission check there, but it looks like it does read lock
sk_callback_lock.  Not sure if that is truly justified here.

-- 
Stephen Smalley
National Security Agency

^ permalink raw reply

* Re: [PATCH] scm: fold __scm_send() into scm_send()
From: Stephen Smalley @ 2006-03-21 13:42 UTC (permalink / raw)
  To: Chris Wright
  Cc: James Morris, Andrew Morton, cxzhang, netdev, ioe-lkml, davem,
	linux-kernel, netdev
In-Reply-To: <1142947952.28120.29.camel@moss-spartans.epoch.ncsc.mil>

On Tue, 2006-03-21 at 08:32 -0500, Stephen Smalley wrote:
> > I don't expect security_sk_sid() to be terribly expensive.  It's not
> > an AVC check, it's just propagating a label.  But I've not done any
> > benchmarking on that.
> 
> No permission check there, but it looks like it does read lock
> sk_callback_lock.  Not sure if that is truly justified here.

Ah, that is because it is also called from the xfrm code, introduced by
Trent's patches.  But that locking shouldn't be necessary from scm_send,
right?  So she likely wants a separate hook for it to avoid that
overhead, or even just a direct SELinux interface?
  
-- 
Stephen Smalley
National Security Agency

^ permalink raw reply

* Re: Please pull bcm43xx softmac-upstream and dscape-upstream branches
From: John W. Linville @ 2006-03-21 19:09 UTC (permalink / raw)
  To: Michael Buesch
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, bcm43xx-dev-0fE9KPoRgkgATYTw5x5z8w
In-Reply-To: <200603052147.55557.mbuesch-KuiJ5kEpwI6ELgA04lAiVw@public.gmane.org>

On Sun, Mar 05, 2006 at 09:47:55PM +0100, Michael Buesch wrote:

> Please pull branches "softmac-upstream" and "dscape-upstream"
> from my repository at:
> git://bu3sch.de/wireless-2.6.git

Merged to softmac and dscape branches of wireless-2.6...thanks!

John
-- 
John W. Linville
linville-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org

^ permalink raw reply

* Re: [PATCH] wireless.git: update acxsm to 0.4.7
From: John W. Linville @ 2006-03-21 19:10 UTC (permalink / raw)
  To: Denis Vlasenko
  Cc: acx100-devel, Carlos Martín, netdev, Christoph Hellwig
In-Reply-To: <200603011558.14968.vda@ilport.com.ua>

On Wed, Mar 01, 2006 at 03:58:14PM +0200, Denis Vlasenko wrote:
> On Tuesday 28 February 2006 03:34, John W. Linville wrote:
> > On Mon, Feb 27, 2006 at 11:44:38AM +0100, Carlos Martín wrote:
> > > On Monday 27 February 2006 11:20, Denis Vlasenko wrote:
> > > > > Comments are welcome and I'll split the patch if needed.
> > 
> > Denis are you applying this patch to your tree?  If so, I'll rely on
> > you to push it to me when you are ready.
> > 
> > If not, then I will need Carlos to generate the diffs so that they
> > can be applied to the top of the tree with -p1.
> > 
> > 	http://linux.yyz.us/patch-format.html
> 
> Changelog:

Merged to softmac branch of wireless-2.6...thanks!

John
-- 
John W. Linville
linville@tuxdriver.com


-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid\x110944&bid$1720&dat\x121642

^ permalink raw reply

* Re: [PATCH 4/6 v2] IB: address translation to map IP toIB addresses (GIDs)
From: Roland Dreier @ 2006-03-21 20:57 UTC (permalink / raw)
  To: Sean Hefty; +Cc: linux-kernel, netdev, openib-general
In-Reply-To: <ORSMSX401FRaqbC8wSA0000000d@orsmsx401.amr.corp.intel.com>

 > +struct workqueue_struct *rdma_wq;
 > +EXPORT_SYMBOL(rdma_wq);

Sean, I don't think I saw an answer when I asked you this before.  Why
is ib_addr exporting a workqueue?  Is there some sort of ordering
constraint that is forcing other modules to go through the same
workqueue for things?

This seems like a very fragile internal thing to be exposing, and I'm
wondering if there's a better way to handle it.

 - R.

^ permalink raw reply

* Re: Re: [PATCH 4/6 v2] IB: address translation to map IP toIB addresses (GIDs)
From: Sean Hefty @ 2006-03-21 21:08 UTC (permalink / raw)
  To: Roland Dreier; +Cc: netdev, linux-kernel, openib-general
In-Reply-To: <adabqvza53f.fsf@cisco.com>

Roland Dreier wrote:
>  > +struct workqueue_struct *rdma_wq;
>  > +EXPORT_SYMBOL(rdma_wq);
> 
> Sean, I don't think I saw an answer when I asked you this before.  Why
> is ib_addr exporting a workqueue?  Is there some sort of ordering
> constraint that is forcing other modules to go through the same
> workqueue for things?
> 
> This seems like a very fragile internal thing to be exposing, and I'm
> wondering if there's a better way to handle it.

I responded in a different thread, but here's what I wrote:

"This is simply an attempt to reduce/combine work queues used by the Infiniband 
code.  This keeps the threading a little simpler in the rdma_cm, since all 
callbacks are invoked using the same work queue.  (I'm also using this with the 
local SA/multicast code, but that's not ready for merging.)"

There's no specific ordering constraint that's required.  We're just ending up 
with several Infiniband modules creating their own work queues (ib_mad, ib_cm, 
ib_addr, rdma_cm, plus a couple more in modules under development), and this is 
an attempt to reduce that.  If having separate work queues would work better, 
there shouldn't be anything that prevents this.

- Sean

^ permalink raw reply

* [git patches] net driver updates
From: Jeff Garzik @ 2006-03-21 21:55 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel

[just pushed upstream; patch too big to inline]

Please pull from 'upstream-linus' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git

to receive the following updates:

 Documentation/networking/e100.txt  |  158 -
 Documentation/networking/e1000.txt |  634 +++--
 MAINTAINERS                        |   16 
 drivers/net/mv643xx_eth.h          |   18 
 drivers/net/pcnet32.c              | 4161 +++++++++++++++++++------------------
 drivers/net/skfp/fplustm.c         |   12 
 drivers/net/skge.c                 |  275 +-
 drivers/net/skge.h                 |    1 
 drivers/net/sky2.c                 |  583 ++---
 drivers/net/sky2.h                 |   22 
 drivers/net/smc91x.c               |   53 
 drivers/net/smc91x.h               |  478 ++--
 12 files changed, 3467 insertions(+), 2944 deletions(-)

Andrew Morton:
      skfp warning fixes

Dale Farnsworth:
      mv643xx_eth: Cache align skb->data if CONFIG_NOT_COHERENT_CACHE

Don Fry:
      pcnet32: support boards with multiple phys

Jeff Garzik:
      [netdrvr] pcnet32: Lindent
      [netdrvr] pcnet32: other source formatting cleanups

Jesse Brandeburg:
      e100/e1000/ixgb: update MAINTAINERS to current developers
      e100: update e100.txt
      e1000: update the readme with the latest text

Nicolas Pitre:
      smc91x: allow for dynamic bus access configs

Stephen Hemminger:
      skge: use NAPI for tx cleanup.
      skge: use auto masking of irqs
      skge: check the allocation of ring buffer
      skge: dma configuration cleanup
      skge: use kcalloc
      skge: use mmiowb
      skge: formmating and whitespace cleanup
      skge: handle pci errors better
      skge: version 1.4
      sky2: remove support for untested Yukon EC/rev 0
      sky2: drop broken wake on lan support
      sky2: rework of NAPI and IRQ management
      sky2: coalescing parameters
      sky2: add MSI support
      sky2: whitespace fixes
      sky2: transmit recovery
      sky2: handle all error irqs
      sky2 version 1.1

^ permalink raw reply

* Re: Re: [PATCH 4/6 v2] IB: address translation to map IP toIB addresses (GIDs)
From: Roland Dreier @ 2006-03-21 22:39 UTC (permalink / raw)
  To: Sean Hefty; +Cc: netdev, linux-kernel, openib-general
In-Reply-To: <44206B53.8020701@ichips.intel.com>

    Sean> "This is simply an attempt to reduce/combine work queues
    Sean> used by the Infiniband code.  This keeps the threading a
    Sean> little simpler in the rdma_cm, since all callbacks are
    Sean> invoked using the same work queue.  (I'm also using this
    Sean> with the local SA/multicast code, but that's not ready for
    Sean> merging.)"

How does it keep the threading model simpler?  Is this an inter-module
dependency.

    Sean> There's no specific ordering constraint that's required.
    Sean> We're just ending up with several Infiniband modules
    Sean> creating their own work queues (ib_mad, ib_cm, ib_addr,
    Sean> rdma_cm, plus a couple more in modules under development),
    Sean> and this is an attempt to reduce that.  If having separate
    Sean> work queues would work better, there shouldn't be anything
    Sean> that prevents this.

It seems like it would be cleaner for each module to have its own
workqueue if it needs one.  There's also schedule_work(), although
that goes to a multi-threaded workqueue.  Michael Tsirkin has
suggested creating a system-wide single-threaded workqueue (ie
something like schedule_ordered_work()) for everyone that occasionally
needs a single-threaded workqueue.

 - R.

^ permalink raw reply

* Re: [iproute2] IPoIB link layer address bug
From: Stephen Hemminger @ 2006-03-21 23:56 UTC (permalink / raw)
  To: James Lentini; +Cc: netdev, openib-general, lartc
In-Reply-To: <Pine.LNX.4.61.0603161707360.23670@jlentini-linux.nane.netapp.com>

On Thu, 16 Mar 2006 17:24:41 -0500 (EST)
James Lentini <jlentini@netapp.com> wrote:

> 
> The ip(8) command has a bug when dealing with IPoIB link layer 
> addresses. Specifically it does not correctly handle the addition of 
> new entries in the neighbor/arp table. For example, this command will 
> fail:
> 
> ip neigh add 192.168.0.138 lladdr 00:00:04:04:fe:80:00:00:00:00:00:00:00:01:73:00:00:00:8a:91 nud permanent dev ib0
> 
> An IPoIB link layer address is 20-bytes (see 
> http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-09.txt, 
> section 9.1.1).
> 
> The command line parsing code expects link layer addresses to be a 
> maximum of 16-bytes. Addresses over 16-bytes are truncated.
> 
> This patch (against the iproute2 cvs repository) fixes the problem:
> 

Okay, but there are number of other places in iproute2 that call ll_addr_a2n()
with ifr.ifr_hwaddr.sa_data. And that is 14 bytes.  If you want to fix those
it will be harder since it would increase the sizeof(struct sockaddr) and potentially
break compatibility.

^ permalink raw reply

* Re: Re: [iproute2] IPoIB link layer address bug
From: Jason Gunthorpe @ 2006-03-22  1:30 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, openib-general, lartc
In-Reply-To: <20060321155617.3ae419b2@localhost.localdomain>

On Tue, Mar 21, 2006 at 03:56:17PM -0800, Stephen Hemminger wrote:

> Okay, but there are number of other places in iproute2 that call ll_addr_a2n()
> with ifr.ifr_hwaddr.sa_data. And that is 14 bytes.  If you want to fix those
> it will be harder since it would increase the sizeof(struct sockaddr) and potentially
> break compatibility.

Maybe the best thing is to upgrade ip (and or netlink?) to use netlink
messages instead of ioctls for the remaining problematic operations.
Since netlink already supports an arbitary length hwaddr there should
be no compatability problem.

Just browsing I see usages of SIOCSIFHWBROADCAST, SIOCSIFHWADDR,
SIOCADDMULTI, SIOCDELMULTI and SIOCGIFHWADDR that use a struct ifreq..

I know SIOCGIFHWADDR can be done over netlink, but I'm not too
familiar with the others..

Jason

^ permalink raw reply

* Re: [PATCH 6/6 v2] IB: userspace support for RDMA connection manager
From: Roland Dreier @ 2006-03-22  1:40 UTC (permalink / raw)
  To: Sean Hefty; +Cc: netdev, linux-kernel, openib-general
In-Reply-To: <ORSMSX4011XvpFVjCRG0000000f@orsmsx401.amr.corp.intel.com>

I added this patch to the rdma_cm branch in my git tree.  When I was
doing that, I noticed that it builds rdma_ucm.ko unconditionally.  It
seems that we want this to depend on CONFIG_INFINIBAND_USER_ACCESS,
since that controls ib_uverbs.ko and ib_ucm.ko.

To do this I rejiggered the Kconfig and Makefile changes I made
before.  I made CONFIG_INFINIBAND_ADDR_TRANS into a bool (instead of a
tristate), so that it's 'y' if INFINIBAND and INET are on, and made
the top of the Makefile look like:

infiniband-$(CONFIG_INFINIBAND_ADDR_TRANS)	:= ib_addr.o rdma_cm.o
user_access-$(CONFIG_INFINIBAND_ADDR_TRANS)	:= rdma_ucm.o

obj-$(CONFIG_INFINIBAND) +=		ib_core.o ib_mad.o ib_sa.o \
					ib_cm.o $(infiniband-y)
obj-$(CONFIG_INFINIBAND_USER_MAD) +=	ib_umad.o
obj-$(CONFIG_INFINIBAND_USER_ACCESS) +=	ib_uverbs.o ib_ucm.o $(user_access-y)

I'm pretty sure this does exactly what we want.

 - R.

^ permalink raw reply

* Re: [PATCH 2.6.16-rc6 1/1] ipw2200: Add Kconfig entries for QOS and Monitor mode
From: Zhu Yi @ 2006-03-22  3:11 UTC (permalink / raw)
  To: Andreas Happe; +Cc: Adrian Bunk, Andrew Morton, linux-kernel, jgarzik, netdev
In-Reply-To: <20060318174703.GA22072@localdomain>

On Sat, 2006-03-18 at 18:47 +0100, Andreas Happe wrote:
> Adds Kconfig entries for enabling Monitor mode and Quality of service
> to the ipw2200 driver. It also renames the IPW_QOS define to
> IPW2200_QOS.
> 
> As Monitor mode generates lots of firmware errors it depends upon
> BROKEN. QOS is under development, so it depends upon EXPERIMENTAL.

Ack the rename and QoS description changes.

The IPW2200_MONITOR and monitor mode firmware error are already fixed in
wireless-2.6 GIT
http://kernel.org/git/?p=linux/kernel/git/linville/wireless-2.6.git;a=summary

Wireless related development happens there. I'd suggest you create
patches against that tree.

Thanks,
-yi

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox