Netdev List


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Ravikiran G Thirumalai @ 2006-03-08 22:25 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Andrew Morton, linux-kernel, davem, netdev, shai
In-Reply-To: <20060308211733.GA5410@kvack.org>

On Wed, Mar 08, 2006 at 04:17:33PM -0500, Benjamin LaHaise wrote:
> On Wed, Mar 08, 2006 at 01:07:26PM -0800, Ravikiran G Thirumalai wrote:
> 
> Last time I checked, all the major architectures had efficient local_t 
> implementations.  Most of the RISC CPUs are able to do a load / store 
> conditional implementation that is the same cost (since memory barriers 
> tend to be explicit on powerpc).  So why not use it?

Then, for the batched percpu_counters, we could gain by using local_t only for 
the UP case. But we will have to have a new local_long_t implementation 
for that.  Do you think just one use case of local_long_t warrants a new
set of APIs?

Kiran


* Re: [PATCH, RESEND] Add MWI workaround for Tulip DC21143
From: Francois Romieu @ 2006-03-08 22:41 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Ralf Baechle, Martin Michlmayr, netdev, Linux/MIPS Development,
	P. Horton
In-Reply-To: <Pine.LNX.4.62.0603071031520.5292@pademelon.sonytel.be>

Geert Uytterhoeven <geert@linux-m68k.org> :
> On Tue, 7 Mar 2006, Ralf Baechle wrote:
[...]
> > I'm just not convinced of having such a workaround as a build option.
> > The average person building a kernel will probably not know if the
> > option needs to be enabled or not.
> 
> Indeed, if it's mentioned in the errata of the chip, the driver should take
> care of it.

Something like the patch below (+Mr Horton Signed-off-by: and description):

diff --git a/drivers/net/tulip/tulip.h b/drivers/net/tulip/tulip.h
index 05d2d96..d109540 100644
--- a/drivers/net/tulip/tulip.h
+++ b/drivers/net/tulip/tulip.h
@@ -262,7 +262,14 @@ enum t21143_csr6_bits {
 #define RX_RING_SIZE	128 
 #define MEDIA_MASK     31
 
+/* MWI can fail on 21143 rev 65 if the receive buffer ends
+   on a cache line boundary. Ensure it doesn't ... */
+
+#ifdef CONFIG_MIPS_COBALT
+#define PKT_BUF_SZ		(1536 + 4)
+#else
 #define PKT_BUF_SZ		1536	/* Size of each temporary Rx buffer. */
+#endif
 
 #define TULIP_MIN_CACHE_LINE	8	/* in units of 32-bit words */
 
diff --git a/drivers/net/tulip/tulip_core.c b/drivers/net/tulip/tulip_core.c
index c67c912..ca6eeda 100644
--- a/drivers/net/tulip/tulip_core.c
+++ b/drivers/net/tulip/tulip_core.c
@@ -294,6 +294,8 @@ static void tulip_up(struct net_device *
 	if (tp->mii_cnt  ||  (tp->mtable  &&  tp->mtable->has_mii))
 		iowrite32(0x00040000, ioaddr + CSR6);
 
+	printk(KERN_DEBUG "%s: CSR0 %08x\n", dev->name, tp->csr0);
+
 	/* Reset the chip, holding bit 0 set at least 50 PCI cycles. */
 	iowrite32(0x00000001, ioaddr + CSR0);
 	udelay(100);
@@ -1155,8 +1157,10 @@ static void __devinit tulip_mwi_config (
 	/* if we have any cache line size at all, we can do MRM */
 	csr0 |= MRM;
 
+#ifndef CONFIG_MIPS_COBALT
 	/* ...and barring hardware bugs, MWI */
 	if (!(tp->chip_id == DC21143 && tp->revision == 65))
+#endif
 		csr0 |= MWI;
 
 	/* set or disable MWI in the standard PCI command bit.
@@ -1182,7 +1186,7 @@ static void __devinit tulip_mwi_config (
 	 */
 	switch (cache) {
 	case 8:
-		csr0 |= MRL | (1 << CALShift) | (16 << BurstLenShift);
+		csr0 |= MRL | (1 << CALShift) | (8 << BurstLenShift);
 		break;
 	case 16:
 		csr0 |= MRL | (2 << CALShift) | (16 << BurstLenShift);


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Benjamin LaHaise @ 2006-03-08 22:41 UTC (permalink / raw)
  To: Ravikiran G Thirumalai; +Cc: Andrew Morton, linux-kernel, davem, netdev, shai
In-Reply-To: <20060308222528.GE4493@localhost.localdomain>

On Wed, Mar 08, 2006 at 02:25:28PM -0800, Ravikiran G Thirumalai wrote:
> Then, for the batched percpu_counters, we could gain by using local_t only for 
> the UP case. But we will have to have a new local_long_t implementation 
> for that.  Do you think just one use case of local_long_t warrants a new
> set of APIs?

I think it may make more sense to simply convert local_t into a long, given 
that most of the users will be things like stats counters.

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <dont@kvack.org>.


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Andrew Morton @ 2006-03-08 23:06 UTC (permalink / raw)
  To: Ravikiran G Thirumalai; +Cc: bcrl, linux-kernel, davem, netdev, shai
In-Reply-To: <20060308222528.GE4493@localhost.localdomain>

Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
>
> On Wed, Mar 08, 2006 at 04:17:33PM -0500, Benjamin LaHaise wrote:
> > On Wed, Mar 08, 2006 at 01:07:26PM -0800, Ravikiran G Thirumalai wrote:
> > 
> > Last time I checked, all the major architectures had efficient local_t 
> > implementations.  Most of the RISC CPUs are able to do a load / store 
> > conditional implementation that is the same cost (since memory barriers 
> > tend to be explicit on powerpc).  So why not use it?
> 
> Then, for the batched percpu_counters, we could gain by using local_t only for 
> the UP case. But we will have to have a new local_long_t implementation 
> for that.  Do you think just one use case of local_long_t warrants a new
> set of APIs?
> 

local_t maps onto 32-bit values on 32-bit machines and onto 64-bit values
on 64-bit machines.  unsigned longs.  I don't quite trust the signedness
handling across all archs.

<looks>

Yes, alpha (for example) went and made its local_t's signed, which is wrong
and dangerous.

ia64 is signed.

mips is signed.

parisc is signed.

s390 is signed.

sparc64 is signed.

x86_64 is signed 32-bit!

All other architectures use unsigned long.  A fiasco.

Once decrapify-asm-generic-localh.patch is merged I think all architectures
can and should use asm-generic/local.h.

Until decrapify-asm-generic-localh.patch has been merged and the downstream
arch consolidation has happened and the signedness problems have been
carefully reviewed and fixed I wouldn't go within a mile of local_t.

Once all that is sorted out then yes, it makes sense to convert per-cpu
counters to local_t.  Note that local_t is unsigned, and percpu_counter
needs to treat it as signed.

We should also move the out-of-line percpu_counter implementation over to
lib/something.c (in obj-y).

But none of that has anything to do with these patches.


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Andrew Morton @ 2006-03-08 23:12 UTC (permalink / raw)
  To: kiran, bcrl, linux-kernel, davem, netdev, shai
In-Reply-To: <20060308150609.344c62fa.akpm@osdl.org>

Andrew Morton <akpm@osdl.org> wrote:
>
> Once decrapify-asm-generic-localh.patch is merged I think all architectures
>  can and should use asm-generic/local.h.

err, no.  Because that's just atomic_long_t, and that's a locked instruction.

We need to review and fix up those architectures which have implemented the
optimised versions.


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Andrew Morton @ 2006-03-08 23:43 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: kiran, linux-kernel, davem, netdev, shai
In-Reply-To: <20060308224140.GC5410@kvack.org>

Benjamin LaHaise <bcrl@kvack.org> wrote:
>
> On Wed, Mar 08, 2006 at 02:25:28PM -0800, Ravikiran G Thirumalai wrote:
> > Then, for the batched percpu_counters, we could gain by using local_t only for 
> > the UP case. But we will have to have a new local_long_t implementation 
> > for that.  Do you think just one use case of local_long_t warrants a new
> > set of APIs?
> 
> I think it may make more sense to simply convert local_t into a long, given 
> that most of the users will be things like stats counters.
> 

Yes, I agree that making local_t signed would be better.  It's consistent
with atomic_t, atomic64_t and atomic_long_t and it's a bit more flexible.

Perhaps.  A lot of applications would just be upcounters for statistics,
where unsigned is desired.  But I think the consistency argument wins out.


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Ravikiran G Thirumalai @ 2006-03-09  0:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Benjamin LaHaise, linux-kernel, davem, netdev, shai
In-Reply-To: <20060308154321.0e779111.akpm@osdl.org>

On Wed, Mar 08, 2006 at 03:43:21PM -0800, Andrew Morton wrote:
> Benjamin LaHaise <bcrl@kvack.org> wrote:
> >
> > I think it may make more sense to simply convert local_t into a long, given 
> > that most of the users will be things like stats counters.
> > 
> 
> Yes, I agree that making local_t signed would be better.  It's consistent
> with atomic_t, atomic64_t and atomic_long_t and it's a bit more flexible.
> 
> Perhaps.  A lot of applications would just be upcounters for statistics,
> where unsigned is desired.  But I think the consistency argument wins out.

It already is... for most of the arches except x86_64.
And on -mm, the asm-generic version uses atomic_long_t for local_t (signed
long) which seems right.

Although, I wonder why we use:

#define local_read(l) ((unsigned long)atomic_long_read(&(l)->a))

It would return a huge value if the local counter was even -1 no?


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Andrew Morton @ 2006-03-09  0:32 UTC (permalink / raw)
  To: Ravikiran G Thirumalai; +Cc: bcrl, linux-kernel, davem, netdev, shai
In-Reply-To: <20060309001803.GF4493@localhost.localdomain>

Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
>
> On Wed, Mar 08, 2006 at 03:43:21PM -0800, Andrew Morton wrote:
> > Benjamin LaHaise <bcrl@kvack.org> wrote:
> > >
> > > I think it may make more sense to simply convert local_t into a long, given 
> > > that most of the users will be things like stats counters.
> > > 
> > 
> > Yes, I agree that making local_t signed would be better.  It's consistent
> > with atomic_t, atomic64_t and atomic_long_t and it's a bit more flexible.
> > 
> > Perhaps.  A lot of applications would just be upcounters for statistics,
> > where unsigned is desired.  But I think the consistency argument wins out.
> 
> It already is... for most of the arches except x86_64.

x86 uses unsigned long.

> And on -mm, the asm-generic version uses atomic_long_t for local_t (signed
> long) which seems right.

No, it uses unsigned long.  The only place where signedness matters is
local_read(), and there it is typecast to ulong.

> Although, I wonder why we use:
> 
> #define local_read(l) ((unsigned long)atomic_long_read(&(l)->a))
> 
> It would return a huge value if the local counter was even -1 no?

It's casting a signed long to an unsigned long.  That does the right thing.
Yes, it'll convert -1 to 0xffffffff[ffffffff].


* Re: [PATCH] compat. ifconf: fix limits
From: David S. Miller @ 2006-03-09  0:46 UTC (permalink / raw)
  To: rdunlap; +Cc: netdev, linux-fsdevel, Alexandra.Kossovsky, ak, akpm, torvalds
In-Reply-To: <20060308091608.c56360dd.rdunlap@xenotime.net>

From: "Randy.Dunlap" <rdunlap@xenotime.net>
Date: Wed, 8 Mar 2006 09:16:08 -0800

> From: Randy Dunlap <rdunlap@xenotime.net>
> 
> A recent change to compat. dev_ifconf() in fs/compat_ioctl.c
> causes ifconf data to be truncated 1 entry too early when copying it
> to userspace.  The correct amount of data (length) is returned,
> but the final entry is empty (zero, not filled in).
> The for-loop 'i' check should use <= to allow the final struct
> ifreq32 to be copied.  I also used the ifconf-corruption program
> in kernel bugzilla #4746 to make sure that this change does not
> re-introduce the corruption.
> 
> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>

Good catch, applied.  Thanks Randy.

Is this one relevant for -stable?


* Re: [PATCH] compat. ifconf: fix limits
From: David S. Miller @ 2006-03-09  1:41 UTC (permalink / raw)
  To: rdunlap; +Cc: netdev, linux-fsdevel, Alexandra.Kossovsky, ak, akpm, torvalds
In-Reply-To: <20060308174116.7cae35e1.rdunlap@xenotime.net>

From: "Randy.Dunlap" <rdunlap@xenotime.net>
Date: Wed, 8 Mar 2006 17:41:16 -0800

> On Wed, 08 Mar 2006 16:46:27 -0800 (PST) David S. Miller wrote:
> 
> > Is this one relevant for -stable?
> 
> Yes, IMO.  Have to wait for it to be merged upstream, right?

I'll take care of everything, thanks Randy.


* Re: [PATCH] compat. ifconf: fix limits
From: Randy.Dunlap @ 2006-03-09  1:41 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, linux-fsdevel, Alexandra.Kossovsky, ak, akpm, torvalds
In-Reply-To: <20060308.164627.81771250.davem@davemloft.net>

On Wed, 08 Mar 2006 16:46:27 -0800 (PST) David S. Miller wrote:

> From: "Randy.Dunlap" <rdunlap@xenotime.net>
> Date: Wed, 8 Mar 2006 09:16:08 -0800
> 
> > From: Randy Dunlap <rdunlap@xenotime.net>
> > 
> > A recent change to compat. dev_ifconf() in fs/compat_ioctl.c
> > causes ifconf data to be truncated 1 entry too early when copying it
> > to userspace.  The correct amount of data (length) is returned,
> > but the final entry is empty (zero, not filled in).
> > The for-loop 'i' check should use <= to allow the final struct
> > ifreq32 to be copied.  I also used the ifconf-corruption program
> > in kernel bugzilla #4746 to make sure that this change does not
> > re-introduce the corruption.
> > 
> > Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
> 
> Good catch, applied.  Thanks Randy.
> 
> Is this one relevant for -stable?

Yes, IMO.  Have to wait for it to be merged upstream, right?

---
~Randy


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Andi Kleen @ 2006-03-09  2:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: bcrl, linux-kernel, davem, netdev, shai
In-Reply-To: <20060308150609.344c62fa.akpm@osdl.org>

Andrew Morton <akpm@osdl.org> writes:
> 
> x86_64 is signed 32-bit!

I'll change it. You want signed 64bit?

-Andi


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Andrew Morton @ 2006-03-09  2:32 UTC (permalink / raw)
  To: Andi Kleen; +Cc: bcrl, linux-kernel, davem, netdev, shai
In-Reply-To: <p733bhswe6j.fsf@verdi.suse.de>

Andi Kleen <ak@suse.de> wrote:
>
> Andrew Morton <akpm@osdl.org> writes:
> > 
> > x86_64 is signed 32-bit!
> 
> I'll change it. You want signed 64bit?
> 

Well it's all random at present.  Since the API is defined as unsigned I
guess it'd be best to make it unsigned for now.  Later, when someone gets
down to making it signed and reviewing all the users they can flip x86_64
to signed along with the rest of the architectures.


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Andi Kleen @ 2006-03-09  4:14 UTC (permalink / raw)
  To: Ravikiran G Thirumalai
  Cc: Andrew Morton, bcrl, linux-kernel, davem, netdev, shai
In-Reply-To: <20060309080651.GA3599@localhost.localdomain>

On Thursday 09 March 2006 09:06, Ravikiran G Thirumalai wrote:
> On Wed, Mar 08, 2006 at 04:32:58PM -0800, Andrew Morton wrote:
> > Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
> > >
> > > On Wed, Mar 08, 2006 at 03:43:21PM -0800, Andrew Morton wrote:
> > > > Benjamin LaHaise <bcrl@kvack.org> wrote:
> > > > >
> > > > > I think it may make more sense to simply convert local_t into a long, given 
> > > > > that most of the users will be things like stats counters.
> > > > > 
> > > > 
> > > > Yes, I agree that making local_t signed would be better.  It's consistent
> > > > with atomic_t, atomic64_t and atomic_long_t and it's a bit more flexible.
> > > > 
> > > > Perhaps.  A lot of applications would just be upcounters for statistics,
> > > > where unsigned is desired.  But I think the consistency argument wins out.
> > > 
> > > It already is... for most of the arches except x86_64.
> > 
> > x86 uses unsigned long.
> 
> Here's a patch making x86_64 local_t to 64 bits like other 64 bit arches.
> This keeps local_t unsigned long.  (We can change it to signed value 
> along with other arches later in one go I guess)

I already did that change in my tree
(except it's currently unsigned long as Andrew indicated)

-Andi


>


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Ravikiran G Thirumalai @ 2006-03-09  8:06 UTC (permalink / raw)
  To: Andrew Morton; +Cc: bcrl, linux-kernel, davem, netdev, shai, Andi Kleen
In-Reply-To: <20060308163258.36f3bd79.akpm@osdl.org>

On Wed, Mar 08, 2006 at 04:32:58PM -0800, Andrew Morton wrote:
> Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
> >
> > On Wed, Mar 08, 2006 at 03:43:21PM -0800, Andrew Morton wrote:
> > > Benjamin LaHaise <bcrl@kvack.org> wrote:
> > > >
> > > > I think it may make more sense to simply convert local_t into a long, given 
> > > > that most of the users will be things like stats counters.
> > > > 
> > > 
> > > Yes, I agree that making local_t signed would be better.  It's consistent
> > > with atomic_t, atomic64_t and atomic_long_t and it's a bit more flexible.
> > > 
> > > Perhaps.  A lot of applications would just be upcounters for statistics,
> > > where unsigned is desired.  But I think the consistency argument wins out.
> > 
> > It already is... for most of the arches except x86_64.
> 
> x86 uses unsigned long.

Here's a patch making x86_64 local_t to 64 bits like other 64 bit arches.
This keeps local_t unsigned long.  (We can change it to signed value 
along with other arches later in one go I guess) 

Thanks,
Kiran


Change x86_64 local_t to 64 bits like all other arches.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>

Index: linux-2.6.16-rc5mm3/include/asm-x86_64/local.h
===================================================================
--- linux-2.6.16-rc5mm3.orig/include/asm-x86_64/local.h	2006-03-08 16:51:31.000000000 -0800
+++ linux-2.6.16-rc5mm3/include/asm-x86_64/local.h	2006-03-08 21:56:01.000000000 -0800
@@ -5,18 +5,18 @@
 
 typedef struct
 {
-	volatile unsigned int counter;
+	volatile long counter;
 } local_t;
 
 #define LOCAL_INIT(i)	{ (i) }
 
-#define local_read(v)	((v)->counter)
+#define local_read(v)	((unsigned long)(v)->counter)
 #define local_set(v,i)	(((v)->counter) = (i))
 
 static __inline__ void local_inc(local_t *v)
 {
 	__asm__ __volatile__(
-		"incl %0"
+		"incq %0"
 		:"=m" (v->counter)
 		:"m" (v->counter));
 }
@@ -24,7 +24,7 @@ static __inline__ void local_inc(local_t
 static __inline__ void local_dec(local_t *v)
 {
 	__asm__ __volatile__(
-		"decl %0"
+		"decq %0"
 		:"=m" (v->counter)
 		:"m" (v->counter));
 }
@@ -32,7 +32,7 @@ static __inline__ void local_dec(local_t
 static __inline__ void local_add(unsigned int i, local_t *v)
 {
 	__asm__ __volatile__(
-		"addl %1,%0"
+		"addq %1,%0"
 		:"=m" (v->counter)
 		:"ir" (i), "m" (v->counter));
 }
@@ -40,7 +40,7 @@ static __inline__ void local_add(unsigne
 static __inline__ void local_sub(unsigned int i, local_t *v)
 {
 	__asm__ __volatile__(
-		"subl %1,%0"
+		"subq %1,%0"
 		:"=m" (v->counter)
 		:"ir" (i), "m" (v->counter));
 }
@@ -71,4 +71,4 @@ static __inline__ void local_sub(unsigne
 #define __cpu_local_add(i, v)	cpu_local_add((i), (v))
 #define __cpu_local_sub(i, v)	cpu_local_sub((i), (v))
 
-#endif /* _ARCH_I386_LOCAL_H */
+#endif /* _ARCH_X8664_LOCAL_H */


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Nick Piggin @ 2006-03-09  8:14 UTC (permalink / raw)
  To: Ravikiran G Thirumalai
  Cc: Andrew Morton, bcrl, linux-kernel, davem, netdev, shai,
	Andi Kleen
In-Reply-To: <20060309080651.GA3599@localhost.localdomain>

Ravikiran G Thirumalai wrote:

> Here's a patch making x86_64 local_t to 64 bits like other 64 bit arches.
> This keeps local_t unsigned long.  (We can change it to signed value 
> along with other arches later in one go I guess) 
> 

Why not just keep naming and structure of interfaces consistent with
atomic_t?

That would be signed and 32-bit. You then also have a local64_t.

-- 
SUSE Labs, Novell Inc.


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Ravikiran G Thirumalai @ 2006-03-09  8:22 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Andrew Morton, bcrl, linux-kernel, davem, netdev, shai,
	Andi Kleen
In-Reply-To: <440FE3E2.1060307@yahoo.com.au>

On Thu, Mar 09, 2006 at 07:14:26PM +1100, Nick Piggin wrote:
> Ravikiran G Thirumalai wrote:
> 
> >Here's a patch making x86_64 local_t to 64 bits like other 64 bit arches.
> >This keeps local_t unsigned long.  (We can change it to signed value 
> >along with other arches later in one go I guess) 
> >
> 
> Why not just keep naming and structure of interfaces consistent with
> atomic_t?
> 
> That would be signed and 32-bit. You then also have a local64_t.

No, local_t is supposed to be 64 bits on 64-bit arches and 32 bits on 32-bit
arches.  x86_64 was the only exception, so this patch fixes that.


* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Nick Piggin @ 2006-03-09  8:41 UTC (permalink / raw)
  To: Ravikiran G Thirumalai
  Cc: Andrew Morton, bcrl, linux-kernel, davem, netdev, shai,
	Andi Kleen
In-Reply-To: <20060309082251.GB3599@localhost.localdomain>

Ravikiran G Thirumalai wrote:
> On Thu, Mar 09, 2006 at 07:14:26PM +1100, Nick Piggin wrote:
> 
>>Ravikiran G Thirumalai wrote:
>>
>>
>>>Here's a patch making x86_64 local_t to 64 bits like other 64 bit arches.
>>>This keeps local_t unsigned long.  (We can change it to signed value 
>>>along with other arches later in one go I guess) 
>>>
>>
>>Why not just keep naming and structure of interfaces consistent with
>>atomic_t?
>>
>>That would be signed and 32-bit. You then also have a local64_t.
> 
> 
> No, local_t is supposed to be 64 bits on 64-bit arches and 32 bits on 32-bit
> arches.  x86_64 was the only exception, so this patch fixes that.
> 
> 

Right. If it wasn't I wouldn't have proposed the change.

Considering that local_t has been broken so that basically nobody
is using it, now is a great time to rethink the types before it
gets fixed and people start using it.

And modelling the type on the atomic types would make the most
sense because everyone already knows them.

-- 
SUSE Labs, Novell Inc.


* Re: [PATCH, RESEND] Add MWI workaround for Tulip DC21143
From: Geert Uytterhoeven @ 2006-03-09  9:37 UTC (permalink / raw)
  To: Francois Romieu
  Cc: Ralf Baechle, Martin Michlmayr, netdev, Linux/MIPS Development,
	P. Horton
In-Reply-To: <20060308224139.GA7536@electric-eye.fr.zoreil.com>

On Wed, 8 Mar 2006, Francois Romieu wrote:
> Geert Uytterhoeven <geert@linux-m68k.org> :
> > On Tue, 7 Mar 2006, Ralf Baechle wrote:
> [...]
> > > I'm just not convinced of having such a workaround as a build option.
> > > The average person building a kernel will probably not know if the
> > > option needs to be enabled or not.
> > 
> > Indeed, if it's mentioned in the errata of the chip, the driver should take
> > care of it.
> 
> Something like the patch below (+Mr Horton Signed-off-by: and description):
> 
> diff --git a/drivers/net/tulip/tulip.h b/drivers/net/tulip/tulip.h
> index 05d2d96..d109540 100644
> --- a/drivers/net/tulip/tulip.h
> +++ b/drivers/net/tulip/tulip.h
> @@ -262,7 +262,14 @@ enum t21143_csr6_bits {
>  #define RX_RING_SIZE	128 
>  #define MEDIA_MASK     31
>  
> +/* MWI can fail on 21143 rev 65 if the receive buffer ends
> +   on a cache line boundary. Ensure it doesn't ... */
> +
> +#ifdef CONFIG_MIPS_COBALT
> +#define PKT_BUF_SZ		(1536 + 4)
> +#else
>  #define PKT_BUF_SZ		1536	/* Size of each temporary Rx buffer. */
> +#endif
>  
>  #define TULIP_MIN_CACHE_LINE	8	/* in units of 32-bit words */
>  
> diff --git a/drivers/net/tulip/tulip_core.c b/drivers/net/tulip/tulip_core.c
> index c67c912..ca6eeda 100644
> --- a/drivers/net/tulip/tulip_core.c
> +++ b/drivers/net/tulip/tulip_core.c
> @@ -294,6 +294,8 @@ static void tulip_up(struct net_device *
>  	if (tp->mii_cnt  ||  (tp->mtable  &&  tp->mtable->has_mii))
>  		iowrite32(0x00040000, ioaddr + CSR6);
>  
> +	printk(KERN_DEBUG "%s: CSR0 %08x\n", dev->name, tp->csr0);
> +
>  	/* Reset the chip, holding bit 0 set at least 50 PCI cycles. */
>  	iowrite32(0x00000001, ioaddr + CSR0);
>  	udelay(100);
> @@ -1155,8 +1157,10 @@ static void __devinit tulip_mwi_config (
>  	/* if we have any cache line size at all, we can do MRM */
>  	csr0 |= MRM;
>  
> +#ifndef CONFIG_MIPS_COBALT
>  	/* ...and barring hardware bugs, MWI */
>  	if (!(tp->chip_id == DC21143 && tp->revision == 65))
> +#endif
>  		csr0 |= MWI;

So when compiling for Cobalt, we work around the hardware bug, while for other
platforms, we just disable MWI?

Wouldn't it be possible to always (I mean, when a rev 65 chip is detected) work
around the bug?

>  	/* set or disable MWI in the standard PCI command bit.
> @@ -1182,7 +1186,7 @@ static void __devinit tulip_mwi_config (
>  	 */
>  	switch (cache) {
>  	case 8:
> -		csr0 |= MRL | (1 << CALShift) | (16 << BurstLenShift);
> +		csr0 |= MRL | (1 << CALShift) | (8 << BurstLenShift);
>  		break;
>  	case 16:
>  		csr0 |= MRL | (2 << CALShift) | (16 << BurstLenShift);

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds


* [UPDATED PATCH] Re: Re: [Patch 7/7] Generic netlink interface (delay accounting)
From: Balbir Singh @ 2006-03-09 14:37 UTC (permalink / raw)
  To: Shailabh Nagar; +Cc: hadi, netdev, linux-kernel, lse-tech
In-Reply-To: <440F52FF.30908@watson.ibm.com>

> Thanks for the clarification of the usage model. While our needs are 
> certainly much less complex,
> it is useful to know the range of options.
> 
> >There are no hard rules on what you need to be multicasting and as an
> >example you could send periodic(aka time based) samples from the kernel
> >on a multicast channel and that would be received by all. It did seem
> >odd that you want to have a semi-promiscuous mode where a response to a
> >GET is multicast. If that is still what you want to achieve, then you
> >should.
> > 
> >>>Also if you can provide feedback whether the doc I sent was any use
> >>>and what wasn't clear etc.
> >also take a look at the excellent documentation Thomas Graf has put in
> >the kernel for all the utilities for manipulating netlink messages and
> >tell me if that should also be put in this doc (It is listed as a TODO).

Hello, Jamal,

Please find the latest version of the patch for review. The genetlink
code has been updated as per your review comments. The changelog is provided
below:

1. Eliminated TASKSTATS_CMD_LISTEN and TASKSTATS_CMD_IGNORE
2. Provide generic functions called genlmsg_data() and genlmsg_len()
   in include/net/genetlink.h
3. Do not multicast all replies, multicast only events generated due
   to task exit.
4. The taskstats and taskstats_reply structures are now 64 bit aligned.
5. Family id is dynamically generated.

Please let us know if we missed something out.

Thanks,
Balbir


Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>

---

 include/linux/delayacct.h |    2 
 include/linux/taskstats.h |  128 ++++++++++++++++++++++++
 include/net/genetlink.h   |   20 +++
 init/Kconfig              |   16 ++-
 kernel/Makefile           |    1 
 kernel/delayacct.c        |   56 ++++++++++
 kernel/taskstats.c        |  244 ++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 464 insertions(+), 3 deletions(-)

diff -puN include/linux/delayacct.h~delayacct-genetlink include/linux/delayacct.h
--- linux-2.6.16-rc5/include/linux/delayacct.h~delayacct-genetlink	2006-03-09 17:15:31.000000000 +0530
+++ linux-2.6.16-rc5-balbir/include/linux/delayacct.h	2006-03-09 17:15:31.000000000 +0530
@@ -15,6 +15,7 @@
 #define _LINUX_TASKDELAYS_H
 
 #include <linux/sched.h>
+#include <linux/taskstats.h>
 
 #ifdef CONFIG_TASK_DELAY_ACCT
 extern int delayacct_on;	/* Delay accounting turned on/off */
@@ -24,6 +25,7 @@ extern void __delayacct_tsk_init(struct 
 extern void __delayacct_tsk_exit(struct task_struct *);
 extern void __delayacct_blkio(void);
 extern void __delayacct_swapin(void);
+extern int delayacct_add_tsk(struct taskstats_reply *, struct task_struct *);
 
 static inline void delayacct_tsk_init(struct task_struct *tsk)
 {
diff -puN /dev/null include/linux/taskstats.h
--- /dev/null	2004-06-24 23:34:38.000000000 +0530
+++ linux-2.6.16-rc5-balbir/include/linux/taskstats.h	2006-03-09 19:28:54.000000000 +0530
@@ -0,0 +1,128 @@
+/* taskstats.h - exporting per-task statistics
+ *
+ * Copyright (C) Shailabh Nagar, IBM Corp. 2006
+ *           (C) Balbir Singh,   IBM Corp. 2006
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ */
+
+#ifndef _LINUX_TASKSTATS_H
+#define _LINUX_TASKSTATS_H
+
+/* Format for per-task data returned to userland when
+ *	- a task exits
+ *	- listener requests stats for a task
+ *
+ * The struct is versioned. Newer versions should only add fields to
+ * the bottom of the struct to maintain backward compatibility.
+ *
+ * To create the next version, bump up the taskstats_version variable
+ * and delineate the start of newly added fields with a comment indicating
+ * the version number.
+ */
+
+#define TASKSTATS_VERSION	1
+
+struct taskstats {
+	/* Maintain 64-bit alignment while extending */
+
+	/* Version 1 */
+#define TASKSTATS_NOPID	-1
+	__s64	pid;
+	__s64	tgid;
+
+	/* XXX_count is number of delay values recorded.
+	 * XXX_total is corresponding cumulative delay in nanoseconds
+	 */
+
+#define TASKSTATS_NOCPUSTATS	1
+	__u64	cpu_count;
+	__u64	cpu_delay_total;	/* wait, while runnable, for cpu */
+	__u64	blkio_count;
+	__u64	blkio_delay_total;	/* sync,block io completion wait*/
+	__u64	swapin_count;
+	__u64	swapin_delay_total;	/* swapin page fault wait*/
+
+	__u64	cpu_run_total;		/* cpu running time
+					 * no count available/provided */
+};
+
+
+#define TASKSTATS_LISTEN_GROUP	0x1
+
+/*
+ * Commands sent from userspace
+ * Not versioned. New commands should only be inserted at the enum's end
+ */
+
+enum {
+	TASKSTATS_CMD_UNSPEC,		/* Reserved */
+	TASKSTATS_CMD_NONE,		/* Not a valid cmd to send
+					 * Marks data sent on task/tgid exit */
+	TASKSTATS_CMD_LISTEN,		/* Start listening */
+	TASKSTATS_CMD_IGNORE,		/* Stop listening */
+	TASKSTATS_CMD_PID,		/* Send stats for a pid */
+	TASKSTATS_CMD_TGID,		/* Send stats for a tgid */
+};
+
+/* Parameters for commands
+ * New parameters should only be inserted at the struct's end
+ */
+
+struct taskstats_cmd_param {
+	/* Maintain 64-bit alignment while extending */
+	union {
+		__s64	pid;
+		__s64	tgid;
+	} id;
+};
+
+enum outtype {
+	TASKSTATS_REPLY_NONE = 1,	/* Control cmd response */
+	TASKSTATS_REPLY_PID,		/* per-pid data cmd response*/
+	TASKSTATS_REPLY_TGID,		/* per-tgid data cmd response*/
+	TASKSTATS_REPLY_EXIT_PID,	/* Exiting task's stats */
+	TASKSTATS_REPLY_EXIT_TGID,	/* Exiting tgid's stats
+					 * (sent on each tid's exit) */
+};
+
+/*
+ * Reply sent from kernel
+ * Version number affects size/format of struct taskstats only
+ */
+
+struct taskstats_reply {
+	/* Maintain 64-bit alignment while extending */
+	__u16 outtype;			/* Must be one of enum outtype */
+	__u16 version;
+	__u32 err;
+	struct taskstats stats;		/* Invalid if err != 0 */
+};
+
+/* NETLINK_GENERIC related info */
+
+#define TASKSTATS_GENL_NAME	"TASKSTATS"
+#define TASKSTATS_GENL_VERSION	0x1
+
+#define TASKSTATS_HDRLEN	(NLMSG_SPACE(GENL_HDRLEN))
+#define TASKSTATS_BODYLEN	(sizeof(struct taskstats_reply))
+
+#ifdef __KERNEL__
+
+#include <linux/sched.h>
+
+#ifdef CONFIG_TASKSTATS
+extern void taskstats_exit_pid(struct task_struct *);
+#else
+static inline void taskstats_exit_pid(struct task_struct *tsk)
+{}
+#endif
+
+#endif /* __KERNEL__ */
+#endif /* _LINUX_TASKSTATS_H */
diff -puN init/Kconfig~delayacct-genetlink init/Kconfig
--- linux-2.6.16-rc5/init/Kconfig~delayacct-genetlink	2006-03-09 17:15:31.000000000 +0530
+++ linux-2.6.16-rc5-balbir/init/Kconfig	2006-03-09 17:15:31.000000000 +0530
@@ -158,11 +158,21 @@ config TASK_DELAY_ACCT
 	  in pages. Such statistics can help in setting a task's priorities
 	  relative to other tasks for cpu, io, rss limits etc.
 
-	  Unlike BSD process accounting, this information is available
-	  continuously during the lifetime of a task.
-
 	  Say N if unsure.
 
+config TASKSTATS
+	bool "Export task/process statistics through netlink (EXPERIMENTAL)"
+	depends on TASK_DELAY_ACCT
+	default y
+	help
+	  Export selected statistics for tasks/processes through the
+	  generic netlink interface. Unlike BSD process accounting, the
+	  statistics are available during the lifetime of tasks/processes as
+	  responses to commands. Like BSD accounting, they are sent to user
+	  space on task exit.
+
+	  Say Y if unsure.
+
 config SYSCTL
 	bool "Sysctl support"
 	---help---
diff -puN kernel/delayacct.c~delayacct-genetlink kernel/delayacct.c
--- linux-2.6.16-rc5/kernel/delayacct.c~delayacct-genetlink	2006-03-09 17:15:31.000000000 +0530
+++ linux-2.6.16-rc5-balbir/kernel/delayacct.c	2006-03-09 17:15:31.000000000 +0530
@@ -16,9 +16,12 @@
 #include <linux/time.h>
 #include <linux/sysctl.h>
 #include <linux/delayacct.h>
+#include <linux/taskstats.h>
+#include <linux/mutex.h>
 
 int delayacct_on = 0;		/* Delay accounting turned on/off */
 kmem_cache_t *delayacct_cache;
+static DEFINE_MUTEX(delayacct_exit_mutex);
 
 static int __init delayacct_setup_enable(char *str)
 {
@@ -51,8 +54,14 @@ void __delayacct_tsk_init(struct task_st
 
 void __delayacct_tsk_exit(struct task_struct *tsk)
 {
+	/*
+	 * Protect against racing thread group exits
+	 */
+	mutex_lock(&delayacct_exit_mutex);
+	taskstats_exit_pid(tsk);
 	kmem_cache_free(delayacct_cache, tsk->delays);
 	tsk->delays = NULL;
+	mutex_unlock(&delayacct_exit_mutex);
 }
 
 static inline nsec_t delayacct_measure(void)
@@ -97,3 +106,50 @@ void __delayacct_swapin(void)
 	current->delays->swapin_count++;
 	spin_unlock(&current->delays->lock);
 }
+
+#ifdef CONFIG_TASKSTATS
+
+int delayacct_add_tsk(struct taskstats_reply *reply, struct task_struct *tsk)
+{
+	struct taskstats *d = &reply->stats;
+	nsec_t tmp;
+	struct timespec ts;
+	unsigned long t1,t2;
+
+	if (!tsk->delays || !delayacct_on)
+		return -EINVAL;
+
+	/* zero XXX_total,non-zero XXX_count implies XXX stat overflowed */
+#ifdef CONFIG_SCHEDSTATS
+
+	tmp = (nsec_t)d->cpu_run_total ;
+	tmp += (u64)(tsk->utime+tsk->stime)*TICK_NSEC;
+	d->cpu_run_total = (tmp < (nsec_t)d->cpu_run_total)? 0:tmp;
+
+	/* No locking available for sched_info. Take snapshot first. */
+	t1 = tsk->sched_info.pcnt;
+	t2 = tsk->sched_info.run_delay;
+
+	d->cpu_count += t1;
+
+	jiffies_to_timespec(t2, &ts);
+	tmp = (nsec_t)d->cpu_delay_total + timespec_to_ns(&ts);
+	d->cpu_delay_total = (tmp < (nsec_t)d->cpu_delay_total)? 0:tmp;
+#else
+	/* Non-zero XXX_total,zero XXX_count implies XXX stat unavailable */
+	d->cpu_count = 0;
+	d->cpu_run_total = d->cpu_delay_total = TASKSTATS_NOCPUSTATS;
+#endif
+	spin_lock(&tsk->delays->lock);
+	tmp = d->blkio_delay_total + tsk->delays->blkio_delay;
+	d->blkio_delay_total = (tmp < d->blkio_delay_total)? 0:tmp;
+	tmp = d->swapin_delay_total + tsk->delays->swapin_delay;
+	d->swapin_delay_total = (tmp < d->swapin_delay_total)? 0:tmp;
+	d->blkio_count += tsk->delays->blkio_count;
+	d->swapin_count += tsk->delays->swapin_count;
+	spin_unlock(&tsk->delays->lock);
+
+	return 0;
+}
+
+#endif /* CONFIG_TASKSTATS */
diff -puN kernel/Makefile~delayacct-genetlink kernel/Makefile
--- linux-2.6.16-rc5/kernel/Makefile~delayacct-genetlink	2006-03-09 17:15:31.000000000 +0530
+++ linux-2.6.16-rc5-balbir/kernel/Makefile	2006-03-09 17:15:31.000000000 +0530
@@ -35,6 +35,7 @@ obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
 obj-$(CONFIG_SECCOMP) += seccomp.o
 obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
 obj-$(CONFIG_TASK_DELAY_ACCT) += delayacct.o
+obj-$(CONFIG_TASKSTATS) += taskstats.o
 
 ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
diff -puN /dev/null kernel/taskstats.c
--- /dev/null	2004-06-24 23:34:38.000000000 +0530
+++ linux-2.6.16-rc5-balbir/kernel/taskstats.c	2006-03-09 18:52:47.000000000 +0530
@@ -0,0 +1,244 @@
+/*
+ * taskstats.c - Export per-task statistics to userland
+ *
+ * Copyright (C) Shailabh Nagar, IBM Corp. 2006
+ *           (C) Balbir Singh,   IBM Corp. 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/taskstats.h>
+#include <linux/delayacct.h>
+#include <net/genetlink.h>
+#include <asm/atomic.h>
+
+const int taskstats_version = TASKSTATS_VERSION;
+static DEFINE_PER_CPU(__u32, taskstats_seqnum) = { 0 };
+static int family_registered = 0;
+
+
+static struct genl_family family = {
+	.id             = GENL_ID_GENERATE,
+	.name           = TASKSTATS_GENL_NAME,
+	.version        = TASKSTATS_GENL_VERSION,
+	.hdrsize        = 0,
+	.maxattr        = 0,
+};
+
+/* Taskstat specific functions */
+static int prepare_reply(struct genl_info *info, u8 cmd,
+			 struct sk_buff **skbp, struct taskstats_reply **replyp)
+{
+	struct sk_buff *skb;
+	struct taskstats_reply *reply;
+
+	skb = nlmsg_new(TASKSTATS_HDRLEN + TASKSTATS_BODYLEN);
+	if (!skb)
+		return -ENOMEM;
+
+	if (!info) {
+		int seq = get_cpu_var(taskstats_seqnum)++;
+		put_cpu_var(taskstats_seqnum);
+
+		reply = genlmsg_put(skb, 0, seq,
+				    family.id, 0, NLM_F_REQUEST,
+				    cmd, family.version);
+	} else
+		reply = genlmsg_put(skb, info->snd_pid, info->snd_seq,
+				    family.id, 0, info->nlhdr->nlmsg_flags,
+				    info->genlhdr->cmd, family.version);
+	if (reply == NULL) {
+		nlmsg_free(skb);
+		return -EINVAL;
+	}
+	skb_put(skb, TASKSTATS_BODYLEN);
+
+	memset(reply, 0, sizeof(*reply));
+	reply->version = taskstats_version;
+	reply->err = 0;
+
+	*skbp = skb;
+	*replyp = reply;
+	return 0;
+}
+
+static int send_reply(struct sk_buff *skb, int replytype, pid_t pid, int event)
+{
+	struct genlmsghdr *genlhdr = nlmsg_data((struct nlmsghdr *)skb->data);
+	struct taskstats_reply *reply;
+	int rc;
+
+	reply = (struct taskstats_reply *)genlmsg_data(genlhdr);
+	reply->outtype = replytype;
+
+	rc = genlmsg_end(skb, reply);
+	if (rc < 0) {
+		nlmsg_free(skb);
+		return rc;
+	}
+
+	if (event)
+		return genlmsg_multicast(skb, pid, TASKSTATS_LISTEN_GROUP);
+	else
+		return genlmsg_unicast(skb, pid);
+}
+
+static inline void fill_pid(struct taskstats_reply *reply, pid_t pid,
+			    struct task_struct *pidtsk)
+{
+	int rc;
+	struct task_struct *tsk = pidtsk;
+
+	if (!pidtsk) {
+		read_lock(&tasklist_lock);
+		tsk = find_task_by_pid(pid);
+		if (!tsk) {
+			read_unlock(&tasklist_lock);
+			reply->err = EINVAL;
+			return;
+		}
+		get_task_struct(tsk);
+		read_unlock(&tasklist_lock);
+	} else
+		get_task_struct(tsk);
+
+	rc = delayacct_add_tsk(reply, tsk);
+	if (!rc) {
+		reply->stats.pid = (s64)tsk->pid;
+		reply->stats.tgid = (s64)tsk->tgid;
+	} else
+		reply->err = (rc < 0) ? -rc : rc ;
+
+	put_task_struct(tsk);
+}
+
+static int taskstats_send_pid(struct sk_buff *skb, struct genl_info *info)
+{
+	int rc;
+	struct sk_buff *rep_skb;
+	struct taskstats_reply *reply;
+	struct taskstats_cmd_param *param= info->userhdr;
+
+	rc = prepare_reply(info, info->genlhdr->cmd, &rep_skb, &reply);
+	if (rc)
+		return rc;
+	fill_pid(reply, param->id.pid, NULL);
+	return send_reply(rep_skb, TASKSTATS_REPLY_PID, info->snd_pid, 0);
+}
+
+static inline void fill_tgid(struct taskstats_reply *reply, pid_t tgid,
+			     struct task_struct *tgidtsk)
+{
+	int rc;
+	struct task_struct *tsk, *first;
+
+	first = tgidtsk;
+	read_lock(&tasklist_lock);
+	if (!first) {
+		first = find_task_by_pid(tgid);
+		if (!first) {
+			read_unlock(&tasklist_lock);
+			reply->err = EINVAL;
+			return;
+		}
+	}
+	tsk = first;
+	do {
+		rc = delayacct_add_tsk(reply, tsk);
+		if (rc)
+			break;
+	} while_each_thread(first, tsk);
+	read_unlock(&tasklist_lock);
+
+	if (!rc) {
+		reply->stats.pid = (s64)TASKSTATS_NOPID;
+		reply->stats.tgid = (s64)tgid;
+	} else
+		reply->err = (rc < 0) ? -rc : rc ;
+}
+
+static int taskstats_send_tgid(struct sk_buff *skb, struct genl_info *info)
+{
+	int rc;
+	struct sk_buff *rep_skb;
+	struct taskstats_reply *reply;
+	struct taskstats_cmd_param *param= info->userhdr;
+
+	rc = prepare_reply(info, info->genlhdr->cmd, &rep_skb, &reply);
+	if (rc)
+		return rc;
+	fill_tgid(reply, param->id.tgid, NULL);
+	return send_reply(rep_skb, TASKSTATS_REPLY_TGID, info->snd_pid, 0);
+}
+
+/* Send pid data out on exit */
+void taskstats_exit_pid(struct task_struct *tsk)
+{
+	int rc;
+	struct sk_buff *rep_skb;
+	struct taskstats_reply *reply;
+
+	/*
+	 * tasks can start to exit very early. Ensure that the family
+	 * is registered before notifications are sent out
+	 */
+	if (!family_registered)
+		return;
+
+	rc = prepare_reply(NULL, TASKSTATS_CMD_NONE, &rep_skb, &reply);
+	if (rc)
+		return;
+	fill_pid(reply, tsk->pid, tsk);
+	rc = send_reply(rep_skb, TASKSTATS_REPLY_EXIT_PID, 0, 1);
+
+	if (rc || thread_group_empty(tsk))
+		return;
+
+	/* Send tgid data too */
+	rc = prepare_reply(NULL, TASKSTATS_CMD_NONE, &rep_skb, &reply);
+	if (rc)
+		return;
+	fill_tgid(reply, tsk->tgid, tsk);
+	send_reply(rep_skb, TASKSTATS_REPLY_EXIT_TGID, 0, 1);
+}
+
+static struct genl_ops pid_ops = {
+	.cmd            = TASKSTATS_CMD_PID,
+	.doit           = taskstats_send_pid,
+};
+
+static struct genl_ops tgid_ops = {
+	.cmd            = TASKSTATS_CMD_TGID,
+	.doit           = taskstats_send_tgid,
+};
+
+static int __init taskstats_init(void)
+{
+	if (genl_register_family(&family))
+		return -EFAULT;
+	family_registered = 1;
+
+	if (genl_register_ops(&family, &pid_ops))
+		goto err;
+	if (genl_register_ops(&family, &tgid_ops))
+		goto err;
+
+	return 0;
+err:
+	genl_unregister_family(&family);
+	family_registered = 0;
+	return -EFAULT;
+}
+
+late_initcall(taskstats_init);
+
diff -puN include/net/genetlink.h~delayacct-genetlink include/net/genetlink.h
--- linux-2.6.16-rc5/include/net/genetlink.h~delayacct-genetlink	2006-03-09 17:15:31.000000000 +0530
+++ linux-2.6.16-rc5-balbir/include/net/genetlink.h	2006-03-09 17:48:39.000000000 +0530
@@ -150,4 +150,24 @@ static inline int genlmsg_unicast(struct
 	return nlmsg_unicast(genl_sock, skb, pid);
 }
 
+/**
+ * genlmsg_data - head of message payload
+ * @gnlh: genetlink message header
+ */
+static inline void *genlmsg_data(const struct genlmsghdr *gnlh)
+{
+	return ((unsigned char *) gnlh + GENL_HDRLEN);
+}
+
+/**
+ * genlmsg_len - length of message payload
+ * @gnlh: genetlink message header
+ */
+static inline int genlmsg_len(const struct genlmsghdr *gnlh)
+{
+	struct nlmsghdr *nlh = (struct nlmsghdr *)((unsigned char *)gnlh -
+						    NLMSG_HDRLEN);
+	return (nlh->nlmsg_len - GENL_HDRLEN - NLMSG_HDRLEN);
+}
+
 #endif	/* __NET_GENERIC_NETLINK_H */
_
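
For readers of the patch, the overflow convention used in delayacct_add_tsk()
(a zero XXX_total with a non-zero XXX_count means the stat overflowed; a
non-zero XXX_total with a zero XXX_count means it is unavailable) can be
sketched in plain C roughly as below. This is only an illustration, not code
from the patch: add_or_zero(), classify() and enum stat_state are hypothetical
names, and uint64_t stands in for __u64/nsec_t.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patch's pattern:
 *	tmp = total + delta;
 *	total = (tmp < total) ? 0 : tmp;
 * Unsigned addition wraps modulo 2^64, so a sum smaller than the
 * old total means the addition overflowed.
 */
static uint64_t add_or_zero(uint64_t total, uint64_t delta)
{
	uint64_t tmp = total + delta;

	return (tmp < total) ? 0 : tmp;
}

/* Hypothetical reader-side view of a (count, total) pair */
enum stat_state { STAT_OK, STAT_OVERFLOWED, STAT_UNAVAILABLE };

static enum stat_state classify(uint64_t count, uint64_t total)
{
	if (total == 0 && count != 0)
		return STAT_OVERFLOWED;		/* zero total, non-zero count */
	if (total != 0 && count == 0)
		return STAT_UNAVAILABLE;	/* non-zero total, zero count */
	return STAT_OK;
}
```

One caveat with this reading: a stat whose recorded delays genuinely sum to
zero would be indistinguishable from an overflowed one, so a consumer would
probably want to treat STAT_OVERFLOWED as a hint rather than a certainty.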



^ permalink raw reply

* Re: [UPDATED PATCH] Re: Re: [Patch 7/7] Generic netlink interface (delay accounting)
From: Shailabh Nagar @ 2006-03-09 16:06 UTC (permalink / raw)
  To: balbir; +Cc: hadi, netdev, linux-kernel, lse-tech
In-Reply-To: <20060309143759.GA4653@in.ibm.com>

Balbir Singh wrote:

<snip>

>Hello, Jamal,
>
>Please find the latest version of the patch for review. The genetlink
>code has been updated as per your review comments. The changelog is provided
>below
>
>1. Eliminated TASKSTATS_CMD_LISTEN and TASKSTATS_CMD_IGNORE
>2. Provide generic functions called genlmsg_data() and genlmsg_len()
>   in include/net/genetlink.h
>  
>
Balbir,
It might be a good idea to split 2. out separately, since it has generic
value beyond the delay accounting patches (just like we did for the
timespec_diff_ns change).


Thanks,
Shailabh

>3. Do not multicast all replies, multicast only events generated due
>   to task exit.
>4. The taskstats and taskstats_reply structures are now 64 bit aligned.
>5. Family id is dynamically generated.
>
>Please let us know if we missed something out.
>
>Thanks,
>Balbir
>
>
>Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
>Signed-off-by: Balbir Singh <balbir@in.ibm.com>
>
>---
>
> include/linux/delayacct.h |    2 
> include/linux/taskstats.h |  128 ++++++++++++++++++++++++
> include/net/genetlink.h   |   20 +++
> init/Kconfig              |   16 ++-
> kernel/Makefile           |    1 
> kernel/delayacct.c        |   56 ++++++++++
> kernel/taskstats.c        |  244 ++++++++++++++++++++++++++++++++++++++++++++++
> 7 files changed, 464 insertions(+), 3 deletions(-)
>
>  
>
<snip>
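
To make the generic value of item 2 a little more concrete, here is a rough
user-space sketch of the pointer arithmetic behind genlmsg_data() and
genlmsg_len(). It is not the kernel code: the demo_* structs and macros are
stand-ins assumed to have the same layout as struct nlmsghdr (16 bytes) and
struct genlmsghdr (4 bytes), and struct demo_msg is a hypothetical fixed-size
message used only to have something to point into.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for struct nlmsghdr and struct genlmsghdr (assumed layout) */
struct demo_nlmsghdr {
	uint32_t nlmsg_len;	/* total length, headers included */
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

struct demo_genlmsghdr {
	uint8_t  cmd;
	uint8_t  version;
	uint16_t reserved;
};

/* Both sizes are already multiples of 4, so no extra alignment is needed */
#define DEMO_NLMSG_HDRLEN	sizeof(struct demo_nlmsghdr)
#define DEMO_GENL_HDRLEN	sizeof(struct demo_genlmsghdr)

/* Hypothetical message: netlink header, genetlink header, 8-byte payload */
struct demo_msg {
	struct demo_nlmsghdr	nlh;
	struct demo_genlmsghdr	gnlh;
	unsigned char		payload[8];
};

/* Payload starts right after the genetlink header */
static void *demo_genlmsg_data(const struct demo_genlmsghdr *gnlh)
{
	return (unsigned char *)gnlh + DEMO_GENL_HDRLEN;
}

/* Payload length: step back to the netlink header, subtract both headers */
static int demo_genlmsg_len(const struct demo_genlmsghdr *gnlh)
{
	const struct demo_nlmsghdr *nlh = (const struct demo_nlmsghdr *)
		((const unsigned char *)gnlh - DEMO_NLMSG_HDRLEN);

	return (int)(nlh->nlmsg_len - DEMO_GENL_HDRLEN - DEMO_NLMSG_HDRLEN);
}
```

With nlh.nlmsg_len set to sizeof(struct demo_msg), demo_genlmsg_len(&m.gnlh)
should come out as the 8 payload bytes and demo_genlmsg_data(&m.gnlh) should
point at m.payload.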




^ permalink raw reply

* Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh
From: Benjamin LaHaise @ 2006-03-09 18:39 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ravikiran G Thirumalai, Andrew Morton, linux-kernel, davem,
	netdev, shai, Andi Kleen
In-Reply-To: <440FEA24.3060307@yahoo.com.au>

On Thu, Mar 09, 2006 at 07:41:08PM +1100, Nick Piggin wrote:
> Considering that local_t has been broken so that basically nobody
> is using it, now is a great time to rethink the types before it
> gets fixed and people start using it.

I'm starting to get more concerned, as the per-cpu changes that keep adding 
more overhead are getting out of hand.

> And modelling the type on the atomic types would make the most
> sense because everyone already knows them.

Except that the usage models are different; local_t is most likely to be 
used for cpu local statistics or for sequences, where making them signed 
is a bit backwards.
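
As a rough illustration of the statistics/sequence usage model (plain C, not
the local_t API; events_between() and seq_before() are hypothetical names),
unsigned counters wrap with well-defined modulo arithmetic, so deltas and
ordering checks keep working across a wrap:

```c
#include <stdint.h>

/*
 * Events counted between two snapshots of a free-running counter.
 * Unsigned subtraction is modulo 2^32, so the result stays correct
 * even if the counter wrapped once between the snapshots.
 */
static uint32_t events_between(uint32_t earlier, uint32_t later)
{
	return later - earlier;
}

/*
 * Wrap-tolerant ordering of sequence numbers: a is "before" b if b is
 * ahead of a by less than half the number space.
 */
static int seq_before(uint32_t a, uint32_t b)
{
	return a != b && (uint32_t)(b - a) < 0x80000000u;
}
```

A signed counter would instead hit undefined behaviour in C on overflow,
which is presumably part of why a signed type looks backwards for this use.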

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <dont@kvack.org>.

^ permalink raw reply

* Re: [RFC PATCH] softmac: (v2) send WEXT assoc/disassoc events to userspace
From: Larry Finger @ 2006-03-09 20:36 UTC (permalink / raw)
  To: Dan Williams
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, bcm43xx-dev-0fE9KPoRgkgATYTw5x5z8w,
	softmac-dev-cdvu00un1VgdHxzADdlk8Q, Denis Vlasenko,
	David Woodhouse
In-Reply-To: <1141935893.28038.2.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>

Dan Williams wrote:
> Completely untested, not entirely sure it compiles.  For whatever
> reason, softmac is sending custom events to userspace already, but it
> should _really_ be sending the right WEXT events instead.  Comments?  If
> this looks good, please apply it.
> 
> Signed-off-by: Dan Williams <dcbw-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> 

V2 compiles cleanly. It still doesn't authenticate with my WRT54G V5, but that must be another problem.

Larry

^ permalink raw reply

* Re: [PATCH, RESEND] Add MWI workaround for Tulip DC21143
From: Francois Romieu @ 2006-03-09 22:44 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Ralf Baechle, Martin Michlmayr, netdev, Linux/MIPS Development,
	P. Horton
In-Reply-To: <Pine.LNX.4.62.0603091032490.9741@pademelon.sonytel.be>

Geert Uytterhoeven <geert@linux-m68k.org> :
[...]
> So when compiling for Cobalt, we work around the hardware bug, while for other
> platforms, we just disable MWI?
> 
> Wouldn't it be possible to always (I mean, when a rev 65 chip is detected)
> work around the bug?

Of course it is possible, but it does not have the same semantics as the
initial patch (not that I know whether it is right or not).

So:
- does the issue exist beyond Cobalt hosts ?
- is the fix Cobalt-only ?

-- 
Ueimor

^ permalink raw reply

* [RFC: 2.6 patch] hostap_hw.c:hfa384x_set_rid(): fix error handling
From: Adrian Bunk @ 2006-03-09 23:06 UTC (permalink / raw)
  To: jkmaline; +Cc: hostap, linux-kernel, netdev, linville

The Coverity checker noted that the call to prism2_hw_reset() was dead 
code.

Does this patch change the code to what was intended?


Signed-off-by: Adrian Bunk <bunk@stusta.de>

--- linux-2.6.16-rc5-mm3-full/drivers/net/wireless/hostap/hostap_hw.c.old	2006-03-09 23:28:30.000000000 +0100
+++ linux-2.6.16-rc5-mm3-full/drivers/net/wireless/hostap/hostap_hw.c	2006-03-09 23:30:19.000000000 +0100
@@ -928,16 +928,16 @@ static int hfa384x_set_rid(struct net_de
 
 	res = hfa384x_cmd(dev, HFA384X_CMDCODE_ACCESS_WRITE, rid, NULL, NULL);
 	up(&local->rid_bap_sem);
+
 	if (res) {
+		if (res == -ETIMEDOUT)
+			prism2_hw_reset(dev);
+
 		printk(KERN_DEBUG "%s: hfa384x_set_rid: CMDCODE_ACCESS_WRITE "
 		       "failed (res=%d, rid=%04x, len=%d)\n",
 		       dev->name, res, rid, len);
-		return res;
 	}
 
-	if (res == -ETIMEDOUT)
-		prism2_hw_reset(dev);
-
 	return res;
 }
 

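For anyone skimming the hunk, the control-flow problem can be reduced to the
following stand-alone sketch. It is not driver code: set_rid_before(),
set_rid_after(), demo_hw_reset() and reset_calls are hypothetical stand-ins
for hfa384x_set_rid() and prism2_hw_reset(), and the hfa384x_cmd() result is
passed in as a parameter.

```c
#include <errno.h>

static int reset_calls;		/* counts calls to the stand-in reset */

static void demo_hw_reset(void)
{
	reset_calls++;
}

/* Old flow: any non-zero res returns early, so the check below is dead */
static int set_rid_before(int res)
{
	if (res)
		return res;

	if (res == -ETIMEDOUT)	/* unreachable with a true condition: res is 0 */
		demo_hw_reset();

	return res;
}

/* Patched flow: the timeout check sits inside the error branch */
static int set_rid_after(int res)
{
	if (res) {
		if (res == -ETIMEDOUT)
			demo_hw_reset();
	}

	return res;
}
```

Both variants return the same value; only the patched one ever reaches the
reset, and only for -ETIMEDOUT.
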
^ permalink raw reply
