public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
@ 2002-01-25 17:28 Andi Kleen
  2002-01-25 18:01 ` John Levon
  0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2002-01-25 17:28 UTC (permalink / raw)
  To: torvalds; +Cc: linux-kernel, davej

[resent because t-offline's outgoing mail server seems to be eating/
not delivering mail. sorry if it appears multiple times]

This patch fixes an reiserfs BUG() at boot time introduced by the 
inode cleanups. The problem is that it passes a 20 char name to
kmem_cache_create() ("reiserfs_inode_cache") but kmem_cache_create()
only allows 19 character names and BUG()s for longer names.

The patch fixes this in a low tech approach.  It's required to boot
a 2.5.3preX machine with reiserfs compiled in.

-Andi


Index: fs/reiserfs/super.c
===================================================================
RCS file: /cvs/linux/fs/reiserfs/super.c,v
retrieving revision 1.17
diff -u -u -r1.17 super.c
--- fs/reiserfs/super.c	2002/01/24 14:07:54	1.17
+++ fs/reiserfs/super.c	2002/01/24 22:03:34
@@ -153,7 +153,7 @@
  
 static int init_inodecache(void)
 {
-	reiserfs_inode_cachep = kmem_cache_create("reiserfs_inode_cache",
+	reiserfs_inode_cachep = kmem_cache_create("reiser_inode_cache",
 					     sizeof(struct reiserfs_inode_info),
 					     0, SLAB_HWCACHE_ALIGN,
 					     init_once, NULL);

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 17:28 [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andi Kleen
@ 2002-01-25 18:01 ` John Levon
  2002-01-25 18:08   ` Linus Torvalds
  0 siblings, 1 reply; 20+ messages in thread
From: John Levon @ 2002-01-25 18:01 UTC (permalink / raw)
  To: Andi Kleen; +Cc: torvalds, linux-kernel, davej

On Fri, Jan 25, 2002 at 06:28:08PM +0100, Andi Kleen wrote:

> kmem_cache_create() ("reiserfs_inode_cache") but kmem_cache_create()
> only allows 19 character names and BUG()s for longer names.

please apply this too then.

regards
john


--- mm/slab.c.old	Fri Jan 25 17:54:11 2002
+++ mm/slab.c	Fri Jan 25 17:55:18 2002
@@ -599,7 +599,10 @@
  * @dtor: A destructor for the objects.
  *
  * Returns a ptr to the cache on success, NULL on failure.
- * Cannot be called within a int, but can be interrupted.
+ * Cannot be called within a interrupt, but can be interrupted.
+ *
+ * strlen(@name) must be less than %CACHE_NAMELEN.
+ * 
  * The @ctor is run when new pages are allocated by the cache
  * and the @dtor is run before the pages are handed back.
  * The flags are

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 18:01 ` John Levon
@ 2002-01-25 18:08   ` Linus Torvalds
  2002-01-25 18:31     ` Andreas Dilger
                       ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Linus Torvalds @ 2002-01-25 18:08 UTC (permalink / raw)
  To: John Levon; +Cc: Andi Kleen, linux-kernel, davej


On Fri, 25 Jan 2002, John Levon wrote:
>
> please apply this too then.

I would prefer instead just avoiding the copy altogether, and just save
the name pointer - with no length restrictions.

Right now the code has the comment

   /* Copy name over so we don't have problems with unloaded modules */

but that was written before "kmem_cache_destroy()" existed, and we should
long ago have fixed any modules that don't properly destroy their caches
when they exit (and yes, I know the difference between "should" and "did",
but that's not an excuse for a bad interface).

		Linus


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 18:08   ` Linus Torvalds
@ 2002-01-25 18:31     ` Andreas Dilger
  2002-01-25 18:46     ` Hans Reiser
  2002-01-25 19:49     ` Andi Kleen
  2 siblings, 0 replies; 20+ messages in thread
From: Andreas Dilger @ 2002-01-25 18:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: John Levon, Andi Kleen, linux-kernel, davej

On Jan 25, 2002  10:08 -0800, Linus Torvalds wrote:
> I would prefer instead just avoiding the copy altogether, and just save
> the name pointer - with no length restrictions.
> 
> Right now the code has the comment
> 
>    /* Copy name over so we don't have problems with unloaded modules */

Yes, I put that in.

> but that was written before "kmem_cache_destroy()" existed, and we should
> long ago have fixed any modules that don't properly destroy their caches
> when they exit (and yes, I know the difference between "should" and "did",
> but that's not an excuse for a bad interface).

The problem is that if, for some reason, the cache is NOT empty when you
call kmem_cache_destroy(), it will not be freed, but the module exits
anyways.  Then, any access to /proc/slabinfo will OOPS.

Yes, code should be written correctly so that its slab is empty when it
exits, but I'd rather have a _bit_ of safety here so that you can at
least check slabinfo when you get a kernel message "slab is not empty"
(or whatever it is) so you can at least try and investigate the problem.

The other alternative is to BUG with enough information to figure out
the status of this cache if you try to free a non-empty cache.  At
least then you would get some data at the time the real problem happens
as opposed to killing some random process later that tries to read
slabinfo.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 18:08   ` Linus Torvalds
  2002-01-25 18:31     ` Andreas Dilger
@ 2002-01-25 18:46     ` Hans Reiser
  2002-01-25 19:49     ` Andi Kleen
  2 siblings, 0 replies; 20+ messages in thread
From: Hans Reiser @ 2002-01-25 18:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: John Levon, Andi Kleen, linux-kernel, davej

We are preparing a large set of patches (each to be sent as a separate 
email per your SOP, but all tested together in a thorough testing 
procedure), which includes this patch, plus the kdev fix, plus all 2.4 
fixes.  We expect to send it out on monday, we just found a bug in the 
patches as a set that was not present in them individually.

Hans



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 18:08   ` Linus Torvalds
  2002-01-25 18:31     ` Andreas Dilger
  2002-01-25 18:46     ` Hans Reiser
@ 2002-01-25 19:49     ` Andi Kleen
  2002-01-25 20:38       ` Andreas Dilger
  2 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2002-01-25 19:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: John Levon, Andi Kleen, linux-kernel, davej

On Fri, Jan 25, 2002 at 10:08:56AM -0800, Linus Torvalds wrote:
> 
> On Fri, 25 Jan 2002, John Levon wrote:
> >
> > please apply this too then.
> 
> I would prefer instead just avoiding the copy altogether, and just save
> the name pointer - with no length restrictions.
> 
> Right now the code has the comment
> 
>    /* Copy name over so we don't have problems with unloaded modules */
> 
> but that was written before "kmem_cache_destroy()" existed, and we should
> long ago have fixed any modules that don't properly destroy their caches
> when they exit (and yes, I know the difference between "should" and "did",
> but that's not an excuse for a bad interface).

This patch implements your suggestion. It works for me, but I haven't audited
the whole tree if they do kmem_cache_destroy as needed. It tries to avoid
oopses for unmapped or bogus names on /proc/slabinfo reading at least. Also 
fixes an warning in slab.c

-Andi

Index: mm/slab.c
===================================================================
RCS file: /cvs/linux/mm/slab.c,v
retrieving revision 1.66
diff -u -u -r1.66 slab.c
--- mm/slab.c	2002/01/08 16:00:20	1.66
+++ mm/slab.c	2002/01/25 19:47:20
@@ -186,8 +186,6 @@
  * manages a cache.
  */
 
-#define CACHE_NAMELEN	20	/* max name length for a slab cache */
-
 struct kmem_cache_s {
 /* 1) each alloc & free */
 	/* full, partial first, then free */
@@ -225,7 +223,7 @@
 	unsigned long		failures;
 
 /* 3) cache creation/removal */
-	char			name[CACHE_NAMELEN];
+	const char			*name;
 	struct list_head	next;
 #ifdef CONFIG_SMP
 /* 4) per-cpu data */
@@ -335,6 +333,7 @@
 	kmem_cache_t	*cs_dmacachep;
 } cache_sizes_t;
 
+/* These are the default caches for kmalloc. Custom caches can have other sizes. */
 static cache_sizes_t cache_sizes[] = {
 #if PAGE_SIZE == 4096
 	{    32,	NULL, NULL},
@@ -353,6 +352,26 @@
 	{131072,	NULL, NULL},
 	{     0,	NULL, NULL}
 };
+/* Must match cache_sizes above. Out of line to keep cache footprint low. */
+#define CN(x) { x, x ## "(DMA)" }
+static char cache_names[][2][18] = { 
+#if PAGE_SIZE == 4096
+	CN("size-32"),
+#endif
+	CN("size-64"),
+	CN("size-128"),
+	CN("size-256"),
+	CN("size-512"),
+	CN("size-1024"),
+	CN("size-2048"),
+	CN("size-4096"),
+	CN("size-8192"),
+	CN("size-16384"),
+	CN("size-32768"),
+	CN("size-65536"),
+	CN("size-131072")
+}; 
+#undef CN
 
 /* internal cache of cache description objs */
 static kmem_cache_t cache_cache = {
@@ -437,7 +456,6 @@
 void __init kmem_cache_sizes_init(void)
 {
 	cache_sizes_t *sizes = cache_sizes;
-	char name[20];
 	/*
 	 * Fragmentation resistance on low memory - only use bigger
 	 * page orders on machines with more than 32MB of memory.
@@ -450,9 +468,9 @@
 		 * eliminates "false sharing".
 		 * Note for systems short on memory removing the alignment will
 		 * allow tighter packing of the smaller caches. */
-		sprintf(name,"size-%Zd",sizes->cs_size);
 		if (!(sizes->cs_cachep =
-			kmem_cache_create(name, sizes->cs_size,
+			kmem_cache_create(cache_names[sizes-cache_sizes][0], 
+					  sizes->cs_size,
 					0, SLAB_HWCACHE_ALIGN, NULL, NULL))) {
 			BUG();
 		}
@@ -462,9 +480,10 @@
 			offslab_limit = sizes->cs_size-sizeof(slab_t);
 			offslab_limit /= 2;
 		}
-		sprintf(name, "size-%Zd(DMA)",sizes->cs_size);
-		sizes->cs_dmacachep = kmem_cache_create(name, sizes->cs_size, 0,
-			      SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL);
+		sizes->cs_dmacachep = kmem_cache_create(
+		    cache_names[sizes-cache_sizes][1], 
+			sizes->cs_size, 0,
+			SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL);
 		if (!sizes->cs_dmacachep)
 			BUG();
 		sizes++;
@@ -604,6 +623,11 @@
  * Cannot be called within a int, but can be interrupted.
  * The @ctor is run when new pages are allocated by the cache
  * and the @dtor is run before the pages are handed back.
+ *
+ * @name must be valid until the cache is destroyed. This implies that
+ * the module calling this has to destroy the cache before getting 
+ * unloaded.
+ * 
  * The flags are
  *
  * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5)
@@ -632,7 +656,6 @@
 	 * Sanity checks... these are all serious usage bugs.
 	 */
 	if ((!name) ||
-		((strlen(name) >= CACHE_NAMELEN - 1)) ||
 		in_interrupt() ||
 		(size < BYTES_PER_WORD) ||
 		(size > (1<<MAX_OBJ_ORDER)*PAGE_SIZE) ||
@@ -797,8 +820,7 @@
 		cachep->slabp_cache = kmem_find_general_cachep(slab_size,0);
 	cachep->ctor = ctor;
 	cachep->dtor = dtor;
-	/* Copy name over so we don't have problems with unloaded modules */
-	strcpy(cachep->name, name);
+	cachep->name = name;
 
 #ifdef CONFIG_SMP
 	if (g_cpucache_up)
@@ -810,11 +832,8 @@
 		struct list_head *p;
 
 		list_for_each(p, &cache_chain) {
-			kmem_cache_t *pc = list_entry(p, kmem_cache_t, next);
-
-			/* The name field is constant - no lock needed. */
-			if (!strcmp(pc->name, name))
-				BUG();
+			kmem_cache_t *pc;
+			pc = list_entry(p, kmem_cache_t, next);
 		}
 	}
 
@@ -1878,6 +1897,7 @@
 		unsigned long	num_objs;
 		unsigned long	active_slabs = 0;
 		unsigned long	num_slabs;
+		const char *name; 
 		cachep = list_entry(p, kmem_cache_t, next);
 
 		spin_lock_irq(&cachep->spinlock);
@@ -1906,8 +1926,15 @@
 		num_slabs+=active_slabs;
 		num_objs = num_slabs*cachep->num;
 
+		name = cachep->name; 
+		{
+		char tmp; 
+		if (get_user(tmp, name)) 
+			name = "broken"; 
+		} 	
+
 		len += sprintf(page+len, "%-17s %6lu %6lu %6u %4lu %4lu %4u",
-			cachep->name, active_objs, num_objs, cachep->objsize,
+			name, active_objs, num_objs, cachep->objsize,
 			active_slabs, num_slabs, (1<<cachep->gfporder));
 
 #if STATS



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 19:49     ` Andi Kleen
@ 2002-01-25 20:38       ` Andreas Dilger
  2002-01-25 22:15         ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Andreas Dilger @ 2002-01-25 20:38 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Linus Torvalds, John Levon, linux-kernel, davej

On Jan 25, 2002  20:49 +0100, Andi Kleen wrote:
> @@ -810,11 +832,8 @@
>  		struct list_head *p;
>  
>  		list_for_each(p, &cache_chain) {
> -			kmem_cache_t *pc = list_entry(p, kmem_cache_t, next);
> -
> -			/* The name field is constant - no lock needed. */
> -			if (!strcmp(pc->name, name))
> -				BUG();
> +			kmem_cache_t *pc;
> +			pc = list_entry(p, kmem_cache_t, next);
>  		}
>  	}
>  

So, what exactly does the above do now (hint: p and pc are both local
so they cannot be referenced anywhere else)?  It used to check that you
weren't trying to add two caches with the same name.  This isn't
possible with caches from broken modules anymore as they have no name.

In the end, it is mostly irrelevant if we have duplicate names in the
slab cache, because you can't "attach" to a cache by name (you can
only "create" a cache and access it via a pointer).  We may as well
just remove the whole loop above, since it doesn't do anything anymore.

> +		name = cachep->name; 
> +		{
> +		char tmp; 
> +		if (get_user(tmp, name)) 
> +			name = "broken"; 
> +		} 	

When calling kmem_cache_destroy() on a non-empty slab we should just
malloc some memory with the old cache name + "_leaked" for the name
pointer.  At least then we have a sane chance of figuring out what caused
the problem, instead of having a bunch of "broken" entries in the table,
and remove the above "broken" check entirely (we will always have a name).

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 20:38       ` Andreas Dilger
@ 2002-01-25 22:15         ` Andi Kleen
  2002-01-25 22:32           ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris
                             ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Andi Kleen @ 2002-01-25 22:15 UTC (permalink / raw)
  To: Andi Kleen, Linus Torvalds, John Levon, linux-kernel, davej

On Fri, Jan 25, 2002 at 01:38:14PM -0700, Andreas Dilger wrote:
> So, what exactly does the above do now (hint: p and pc are both local
> so they cannot be referenced anywhere else)?  It used to check that you
> weren't trying to add two caches with the same name.  This isn't
> possible with caches from broken modules anymore as they have no name.

I have fixed the loop now to check for names again.

> When calling kmem_cache_destroy() on a non-empty slab we should just
> malloc some memory with the old cache name + "_leaked" for the name
> pointer.  At least then we have a sane chance of figuring out what caused
> the problem, instead of having a bunch of "broken" entries in the table,
> and remove the above "broken" check entirely (we will always have a name).

I don't like this because it complicates the code too much. 
"broken" should be enough to debug it. 

New patch appended. Linus please apply if you didn't already. 

-Andi


Index: mm/slab.c
===================================================================
RCS file: /cvs/linux/mm/slab.c,v
retrieving revision 1.66
diff -u -u -r1.66 slab.c
--- mm/slab.c	2002/01/08 16:00:20	1.66
+++ mm/slab.c	2002/01/25 22:14:40
@@ -186,8 +186,6 @@
  * manages a cache.
  */
 
-#define CACHE_NAMELEN	20	/* max name length for a slab cache */
-
 struct kmem_cache_s {
 /* 1) each alloc & free */
 	/* full, partial first, then free */
@@ -225,7 +223,7 @@
 	unsigned long		failures;
 
 /* 3) cache creation/removal */
-	char			name[CACHE_NAMELEN];
+	const char			*name;
 	struct list_head	next;
 #ifdef CONFIG_SMP
 /* 4) per-cpu data */
@@ -335,6 +333,7 @@
 	kmem_cache_t	*cs_dmacachep;
 } cache_sizes_t;
 
+/* These are the default caches for kmalloc. Custom caches can have other sizes. */
 static cache_sizes_t cache_sizes[] = {
 #if PAGE_SIZE == 4096
 	{    32,	NULL, NULL},
@@ -353,6 +352,26 @@
 	{131072,	NULL, NULL},
 	{     0,	NULL, NULL}
 };
+/* Must match cache_sizes above. Out of line to keep cache footprint low. */
+#define CN(x) { x, x ## "(DMA)" }
+static char cache_names[][2][18] = { 
+#if PAGE_SIZE == 4096
+	CN("size-32"),
+#endif
+	CN("size-64"),
+	CN("size-128"),
+	CN("size-256"),
+	CN("size-512"),
+	CN("size-1024"),
+	CN("size-2048"),
+	CN("size-4096"),
+	CN("size-8192"),
+	CN("size-16384"),
+	CN("size-32768"),
+	CN("size-65536"),
+	CN("size-131072")
+}; 
+#undef CN
 
 /* internal cache of cache description objs */
 static kmem_cache_t cache_cache = {
@@ -437,7 +456,6 @@
 void __init kmem_cache_sizes_init(void)
 {
 	cache_sizes_t *sizes = cache_sizes;
-	char name[20];
 	/*
 	 * Fragmentation resistance on low memory - only use bigger
 	 * page orders on machines with more than 32MB of memory.
@@ -450,9 +468,9 @@
 		 * eliminates "false sharing".
 		 * Note for systems short on memory removing the alignment will
 		 * allow tighter packing of the smaller caches. */
-		sprintf(name,"size-%Zd",sizes->cs_size);
 		if (!(sizes->cs_cachep =
-			kmem_cache_create(name, sizes->cs_size,
+			kmem_cache_create(cache_names[sizes-cache_sizes][0], 
+					  sizes->cs_size,
 					0, SLAB_HWCACHE_ALIGN, NULL, NULL))) {
 			BUG();
 		}
@@ -462,9 +480,10 @@
 			offslab_limit = sizes->cs_size-sizeof(slab_t);
 			offslab_limit /= 2;
 		}
-		sprintf(name, "size-%Zd(DMA)",sizes->cs_size);
-		sizes->cs_dmacachep = kmem_cache_create(name, sizes->cs_size, 0,
-			      SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL);
+		sizes->cs_dmacachep = kmem_cache_create(
+		    cache_names[sizes-cache_sizes][1], 
+			sizes->cs_size, 0,
+			SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL);
 		if (!sizes->cs_dmacachep)
 			BUG();
 		sizes++;
@@ -604,6 +623,11 @@
  * Cannot be called within a int, but can be interrupted.
  * The @ctor is run when new pages are allocated by the cache
  * and the @dtor is run before the pages are handed back.
+ *
+ * @name must be valid until the cache is destroyed. This implies that
+ * the module calling this has to destroy the cache before getting 
+ * unloaded.
+ * 
  * The flags are
  *
  * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5)
@@ -632,7 +656,6 @@
 	 * Sanity checks... these are all serious usage bugs.
 	 */
 	if ((!name) ||
-		((strlen(name) >= CACHE_NAMELEN - 1)) ||
 		in_interrupt() ||
 		(size < BYTES_PER_WORD) ||
 		(size > (1<<MAX_OBJ_ORDER)*PAGE_SIZE) ||
@@ -797,8 +820,7 @@
 		cachep->slabp_cache = kmem_find_general_cachep(slab_size,0);
 	cachep->ctor = ctor;
 	cachep->dtor = dtor;
-	/* Copy name over so we don't have problems with unloaded modules */
-	strcpy(cachep->name, name);
+	cachep->name = name;
 
 #ifdef CONFIG_SMP
 	if (g_cpucache_up)
@@ -811,10 +833,11 @@
 
 		list_for_each(p, &cache_chain) {
 			kmem_cache_t *pc = list_entry(p, kmem_cache_t, next);
-
-			/* The name field is constant - no lock needed. */
-			if (!strcmp(pc->name, name))
-				BUG();
+			char tmp;
+			if (get_user(tmp,pc->name)) 
+				continue; 
+			if (!strcmp(pc->name,name))
+				BUG(); 
 		}
 	}
 
@@ -1878,6 +1901,7 @@
 		unsigned long	num_objs;
 		unsigned long	active_slabs = 0;
 		unsigned long	num_slabs;
+		const char *name; 
 		cachep = list_entry(p, kmem_cache_t, next);
 
 		spin_lock_irq(&cachep->spinlock);
@@ -1906,8 +1930,15 @@
 		num_slabs+=active_slabs;
 		num_objs = num_slabs*cachep->num;
 
+		name = cachep->name; 
+		{
+		char tmp; 
+		if (get_user(tmp, name)) 
+			name = "broken"; 
+		} 	
+
 		len += sprintf(page+len, "%-17s %6lu %6lu %6u %4lu %4lu %4u",
-			cachep->name, active_objs, num_objs, cachep->objsize,
+			name, active_objs, num_objs, cachep->objsize,
 			active_slabs, num_slabs, (1<<cachep->gfporder));
 
 #if STATS

^ permalink raw reply	[flat|nested] 20+ messages in thread

* eth0: NULL pointer encountered in RX ring, skipping
  2002-01-25 22:15         ` Andi Kleen
@ 2002-01-25 22:32           ` Andrea Ferraris
  2002-01-25 22:59             ` Jeff Garzik
  2002-01-25 22:41           ` [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andreas Dilger
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Andrea Ferraris @ 2002-01-25 22:32 UTC (permalink / raw)
  To: linux-kernel

After such kernel message the network interface (kernel 2.4.2) replied to 
ping from other network's machines with times varying between 200 and 2000 ms
and it was almost impossible to use the network to transferring files to or 
from this PC.
The card is a SIS960 on the mboard on an Acer PC with a Celeron 666 Mhz 
processor.

The problem was solved with a shutdown -h, powering down and powering up
the PC.

Can somebody explain the trouble?

Best regards to all all and thx to replying people,

Andrea

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 22:15         ` Andi Kleen
  2002-01-25 22:32           ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris
@ 2002-01-25 22:41           ` Andreas Dilger
  2002-01-26  7:24           ` Kai Henningsen
  2002-01-27 23:02           ` Alessandro Suardi
  3 siblings, 0 replies; 20+ messages in thread
From: Andreas Dilger @ 2002-01-25 22:41 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Linus Torvalds, John Levon, linux-kernel, davej

On Jan 25, 2002  23:15 +0100, Andi Kleen wrote:
> On Fri, Jan 25, 2002 at 01:38:14PM -0700, Andreas Dilger wrote:
> > When calling kmem_cache_destroy() on a non-empty slab we should just
> > malloc some memory with the old cache name + "_leaked" for the name
> > pointer.  At least then we have a sane chance of figuring out what caused
> > the problem, instead of having a bunch of "broken" entries in the table,
> > and remove the above "broken" check entirely (we will always have a name).
> 
> I don't like this because it complicates the code too much. 
> "broken" should be enough to debug it. 

Hmm, then you could just point to a static "broken" name at
kmem_cache_destroy() time and save yourself the get_user() checks
for each access to the name.  This would gratuitously overwrite
the name for non-modular caches that failed to unload, but I doubt
that such things exist.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: eth0: NULL pointer encountered in RX ring, skipping
  2002-01-25 22:32           ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris
@ 2002-01-25 22:59             ` Jeff Garzik
  2002-01-26 10:11               ` Andrea Ferraris
  0 siblings, 1 reply; 20+ messages in thread
From: Jeff Garzik @ 2002-01-25 22:59 UTC (permalink / raw)
  To: andrea_ferraris; +Cc: linux-kernel

Well, the code says "this should never happen" ;-)

But anyway, it is probably a temporary memory allocation failure.  The
code handles this case.

	Jeff



-- 
Jeff Garzik      | "I went through my candy like hot oatmeal
Building 1024    |  through an internally-buttered weasel."
MandrakeSoft     |             - goats.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 22:15         ` Andi Kleen
  2002-01-25 22:32           ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris
  2002-01-25 22:41           ` [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andreas Dilger
@ 2002-01-26  7:24           ` Kai Henningsen
  2002-01-27 23:02           ` Alessandro Suardi
  3 siblings, 0 replies; 20+ messages in thread
From: Kai Henningsen @ 2002-01-26  7:24 UTC (permalink / raw)
  To: ak; +Cc: linux-kernel

ak@suse.de (Andi Kleen)  wrote on 25.01.02 in <20020125231555.A22583@wotan.suse.de>:

> +/* Must match cache_sizes above. Out of line to keep cache footprint low.
> */ +#define CN(x) { x, x ## "(DMA)" }
> +static char cache_names[][2][18] = {


> +	CN("size-128"),
> +	CN("size-256"),
> +	CN("size-512"),


What on earth is that ## for?! If that actually works, I strongly suspect  
that it is a bug in cpp.

MfG Kai

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: eth0: NULL pointer encountered in RX ring, skipping
  2002-01-25 22:59             ` Jeff Garzik
@ 2002-01-26 10:11               ` Andrea Ferraris
  2002-01-26 15:24                 ` OPS: " Andrea Ferraris
  0 siblings, 1 reply; 20+ messages in thread
From: Andrea Ferraris @ 2002-01-26 10:11 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel

Friday 25 January 2002 23:59, Jeff Garzik scrisse:
> Well, the code says "this should never happen" ;-)
>
> But anyway, it is probably a temporary memory allocation failure.  The
> code handles this case.

Yes, but I think that isn't normal to have to do a cold reboot to have the 
machine again working on the network. It is, maybe that the code doesn't 
handle so well this case. Do you suggest a kernel upgrade?

Andrea

^ permalink raw reply	[flat|nested] 20+ messages in thread

* OPS: eth0: NULL pointer encountered in RX ring, skipping
  2002-01-26 10:11               ` Andrea Ferraris
@ 2002-01-26 15:24                 ` Andrea Ferraris
  0 siblings, 0 replies; 20+ messages in thread
From: Andrea Ferraris @ 2002-01-26 15:24 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel

Saturday 26 January 2002 11:11, Andrea Ferraris scrisse:
> Friday 25 January 2002 23:59, Jeff Garzik scrisse:
> > Well, the code says "this should never happen" ;-)
> >
> > But anyway, it is probably a temporary memory allocation failure.  The
> > code handles this case.
>
> Yes, but I think that isn't normal to have to do a cold reboot to have the
> machine again working on the network. It is, maybe that the code doesn't
> handle so well this case. Do you suggest a kernel upgrade?

Ops ... sorry. It was a 2.2.16 kernel.

Andrea

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-25 22:15         ` Andi Kleen
                             ` (2 preceding siblings ...)
  2002-01-26  7:24           ` Kai Henningsen
@ 2002-01-27 23:02           ` Alessandro Suardi
  2002-01-28  0:01             ` Andi Kleen
  3 siblings, 1 reply; 20+ messages in thread
From: Alessandro Suardi @ 2002-01-27 23:02 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Linus Torvalds, John Levon, linux-kernel, davej

Andi Kleen wrote:
> 
> On Fri, Jan 25, 2002 at 01:38:14PM -0700, Andreas Dilger wrote:
> > So, what exactly does the above do now (hint: p and pc are both local
> > so they cannot be referenced anywhere else)?  It used to check that you
> > weren't trying to add two caches with the same name.  This isn't
> > possible with caches from broken modules anymore as they have no name.
> 
> I have fixed the loop now to check for names again.
> 
> > When calling kmem_cache_destroy() on a non-empty slab we should just
> > malloc some memory with the old cache name + "_leaked" for the name
> > pointer.  At least then we have a sane chance of figuring out what caused
> > the problem, instead of having a bunch of "broken" entries in the table,
> > and remove the above "broken" check entirely (we will always have a name).
> 
> I don't like this because it complicates the code too much.
> "broken" should be enough to debug it.
> 
> New patch appended. Linus please apply if you didn't already.
> 
> -Andi



2.5.3-pre5 + this patch still can't boot my system. I haven't had
 time to copy down oops at boot, will do if needed.

Thanks & ciao,



> Index: mm/slab.c
> ===================================================================
> RCS file: /cvs/linux/mm/slab.c,v
> retrieving revision 1.66
> diff -u -u -r1.66 slab.c
> --- mm/slab.c   2002/01/08 16:00:20     1.66
> +++ mm/slab.c   2002/01/25 22:14:40
> @@ -186,8 +186,6 @@
>   * manages a cache.
>   */
> 
> -#define CACHE_NAMELEN  20      /* max name length for a slab cache */
> -
>  struct kmem_cache_s {
>  /* 1) each alloc & free */
>         /* full, partial first, then free */
> @@ -225,7 +223,7 @@
>         unsigned long           failures;
> 
>  /* 3) cache creation/removal */
> -       char                    name[CACHE_NAMELEN];
> +       const char                      *name;
>         struct list_head        next;
>  #ifdef CONFIG_SMP
>  /* 4) per-cpu data */
> @@ -335,6 +333,7 @@
>         kmem_cache_t    *cs_dmacachep;
>  } cache_sizes_t;
> 
> +/* These are the default caches for kmalloc. Custom caches can have other sizes. */
>  static cache_sizes_t cache_sizes[] = {
>  #if PAGE_SIZE == 4096
>         {    32,        NULL, NULL},
> @@ -353,6 +352,26 @@
>         {131072,        NULL, NULL},
>         {     0,        NULL, NULL}
>  };
> +/* Must match cache_sizes above. Out of line to keep cache footprint low. */
> +#define CN(x) { x, x ## "(DMA)" }
> +static char cache_names[][2][18] = {
> +#if PAGE_SIZE == 4096
> +       CN("size-32"),
> +#endif
> +       CN("size-64"),
> +       CN("size-128"),
> +       CN("size-256"),
> +       CN("size-512"),
> +       CN("size-1024"),
> +       CN("size-2048"),
> +       CN("size-4096"),
> +       CN("size-8192"),
> +       CN("size-16384"),
> +       CN("size-32768"),
> +       CN("size-65536"),
> +       CN("size-131072")
> +};
> +#undef CN
> 
>  /* internal cache of cache description objs */
>  static kmem_cache_t cache_cache = {
> @@ -437,7 +456,6 @@
>  void __init kmem_cache_sizes_init(void)
>  {
>         cache_sizes_t *sizes = cache_sizes;
> -       char name[20];
>         /*
>          * Fragmentation resistance on low memory - only use bigger
>          * page orders on machines with more than 32MB of memory.
> @@ -450,9 +468,9 @@
>                  * eliminates "false sharing".
>                  * Note for systems short on memory removing the alignment will
>                  * allow tighter packing of the smaller caches. */
> -               sprintf(name,"size-%Zd",sizes->cs_size);
>                 if (!(sizes->cs_cachep =
> -                       kmem_cache_create(name, sizes->cs_size,
> +                       kmem_cache_create(cache_names[sizes-cache_sizes][0],
> +                                         sizes->cs_size,
>                                         0, SLAB_HWCACHE_ALIGN, NULL, NULL))) {
>                         BUG();
>                 }
> @@ -462,9 +480,10 @@
>                         offslab_limit = sizes->cs_size-sizeof(slab_t);
>                         offslab_limit /= 2;
>                 }
> -               sprintf(name, "size-%Zd(DMA)",sizes->cs_size);
> -               sizes->cs_dmacachep = kmem_cache_create(name, sizes->cs_size, 0,
> -                             SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL);
> +               sizes->cs_dmacachep = kmem_cache_create(
> +                   cache_names[sizes-cache_sizes][1],
> +                       sizes->cs_size, 0,
> +                       SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL);
>                 if (!sizes->cs_dmacachep)
>                         BUG();
>                 sizes++;
> @@ -604,6 +623,11 @@
>   * Cannot be called within a int, but can be interrupted.
>   * The @ctor is run when new pages are allocated by the cache
>   * and the @dtor is run before the pages are handed back.
> + *
> + * @name must be valid until the cache is destroyed. This implies that
> + * the module calling this has to destroy the cache before getting
> + * unloaded.
> + *
>   * The flags are
>   *
>   * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5)
> @@ -632,7 +656,6 @@
>          * Sanity checks... these are all serious usage bugs.
>          */
>         if ((!name) ||
> -               ((strlen(name) >= CACHE_NAMELEN - 1)) ||
>                 in_interrupt() ||
>                 (size < BYTES_PER_WORD) ||
>                 (size > (1<<MAX_OBJ_ORDER)*PAGE_SIZE) ||
> @@ -797,8 +820,7 @@
>                 cachep->slabp_cache = kmem_find_general_cachep(slab_size,0);
>         cachep->ctor = ctor;
>         cachep->dtor = dtor;
> -       /* Copy name over so we don't have problems with unloaded modules */
> -       strcpy(cachep->name, name);
> +       cachep->name = name;
> 
>  #ifdef CONFIG_SMP
>         if (g_cpucache_up)
> @@ -811,10 +833,11 @@
> 
>                 list_for_each(p, &cache_chain) {
>                         kmem_cache_t *pc = list_entry(p, kmem_cache_t, next);
> -
> -                       /* The name field is constant - no lock needed. */
> -                       if (!strcmp(pc->name, name))
> -                               BUG();
> +                       char tmp;
> +                       if (get_user(tmp,pc->name))
> +                               continue;
> +                       if (!strcmp(pc->name,name))
> +                               BUG();
>                 }
>         }
> 
> @@ -1878,6 +1901,7 @@
>                 unsigned long   num_objs;
>                 unsigned long   active_slabs = 0;
>                 unsigned long   num_slabs;
> +               const char *name;
>                 cachep = list_entry(p, kmem_cache_t, next);
> 
>                 spin_lock_irq(&cachep->spinlock);
> @@ -1906,8 +1930,15 @@
>                 num_slabs+=active_slabs;
>                 num_objs = num_slabs*cachep->num;
> 
> +               name = cachep->name;
> +               {
> +               char tmp;
> +               if (get_user(tmp, name))
> +                       name = "broken";
> +               }
> +
>                 len += sprintf(page+len, "%-17s %6lu %6lu %6u %4lu %4lu %4u",
> -                       cachep->name, active_objs, num_objs, cachep->objsize,
> +                       name, active_objs, num_objs, cachep->objsize,
>                         active_slabs, num_slabs, (1<<cachep->gfporder));
> 
>  #if STATS

--alessandro

 "this machine will, will not communicate
   these thoughts and the strain I am under
  be a world child, form a circle before we all go under"
                         (Radiohead, "Street Spirit [fade out]")

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-27 23:02           ` Alessandro Suardi
@ 2002-01-28  0:01             ` Andi Kleen
  2002-01-28 11:07               ` Jens Axboe
  2002-01-29 13:14               ` Alessandro Suardi
  0 siblings, 2 replies; 20+ messages in thread
From: Andi Kleen @ 2002-01-28  0:01 UTC (permalink / raw)
  To: Alessandro Suardi
  Cc: Andi Kleen, Linus Torvalds, John Levon, linux-kernel, davej

On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote:
> 
> 2.5.3-pre5 + this patch still can't boot my system. I haven't had
>  time to copy down oops at boot, will do if needed.

Please do. I cannot see anything in the patch that should prevent bootup
though, so I would also recommend a make clean and recompile first just
to make sure it isn't a broken build. 

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-28  0:01             ` Andi Kleen
@ 2002-01-28 11:07               ` Jens Axboe
  2002-01-28 14:53                 ` Andi Kleen
  2002-01-29 13:14               ` Alessandro Suardi
  1 sibling, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2002-01-28 11:07 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Alessandro Suardi, Linus Torvalds, John Levon, linux-kernel,
	davej

On Mon, Jan 28 2002, Andi Kleen wrote:
> On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote:
> > 
> > 2.5.3-pre5 + this patch still can't boot my system. I haven't had
> >  time to copy down oops at boot, will do if needed.
> 
> Please do. I cannot see anything in the patch that should prevent bootup
> though, so I would also recommend a make clean and recompile first just
> to make sure it isn't a broken build. 

Probably the kmem_cache_create 'name too long' bug that Viro pointed out
to me. fs/reiserfs/super.c:init_inodecache(). Change the name passed to
kmem_cache_create to something shorter.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-28 11:07               ` Jens Axboe
@ 2002-01-28 14:53                 ` Andi Kleen
  2002-01-28 14:54                   ` Jens Axboe
  0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2002-01-28 14:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Andi Kleen, Alessandro Suardi, Linus Torvalds, John Levon,
	linux-kernel, davej

On Mon, Jan 28, 2002 at 12:07:47PM +0100, Jens Axboe wrote:
> On Mon, Jan 28 2002, Andi Kleen wrote:
> > On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote:
> > > 
> > > 2.5.3-pre5 + this patch still can't boot my system. I haven't had
> > >  time to copy down oops at boot, will do if needed.
> > 
> > Please do. I cannot see anything in the patch that should prevent bootup
> > though, so I would also recommend a make clean and recompile first just
> > to make sure it isn't a broken build. 
> 
> Probably the kmem_cache_create 'name too long' bug that Viro pointed out
> to me. fs/reiserfs/super.c:init_inodecache(). Change the name passed to
> kmem_cache_create to something shorter.

The patch he tried removed the check of the name length ;) 

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-28 14:53                 ` Andi Kleen
@ 2002-01-28 14:54                   ` Jens Axboe
  0 siblings, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2002-01-28 14:54 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Alessandro Suardi, Linus Torvalds, John Levon, linux-kernel,
	davej

On Mon, Jan 28 2002, Andi Kleen wrote:
> On Mon, Jan 28, 2002 at 12:07:47PM +0100, Jens Axboe wrote:
> > On Mon, Jan 28 2002, Andi Kleen wrote:
> > > On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote:
> > > > 
> > > > 2.5.3-pre5 + this patch still can't boot my system. I haven't had
> > > >  time to copy down oops at boot, will do if needed.
> > > 
> > > Please do. I cannot see anything in the patch that should prevent bootup
> > > though, so I would also recommend a make clean and recompile first just
> > > to make sure it isn't a broken build. 
> > 
> > Probably the kmem_cache_create 'name too long' bug that Viro pointed out
> > to me. fs/reiserfs/super.c:init_inodecache(). Change the name passed to
> > kmem_cache_create to something shorter.
> 
> The patch he tried removed the check of the name length ;) 

Heh, my cover is blown -- I didn't read it :-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
  2002-01-28  0:01             ` Andi Kleen
  2002-01-28 11:07               ` Jens Axboe
@ 2002-01-29 13:14               ` Alessandro Suardi
  1 sibling, 0 replies; 20+ messages in thread
From: Alessandro Suardi @ 2002-01-29 13:14 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Linus Torvalds, John Levon, linux-kernel, davej

Andi Kleen wrote:
> 
> On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote:
> >
> > 2.5.3-pre5 + this patch still can't boot my system. I haven't had
> >  time to copy down oops at boot, will do if needed.
> 
> Please do. I cannot see anything in the patch that should prevent bootup
> though, so I would also recommend a make clean and recompile first just
> to make sure it isn't a broken build.

I ended up away from email for a couple of days and saw -pre6;
 re-patched from 2.5.2 to -pre6, boot is okay .

Thanks & ciao,

--alessandro

 "this machine will, will not communicate
   these thoughts and the strain I am under
  be a world child, form a circle before we all go under"
                         (Radiohead, "Street Spirit [fade out]")

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2002-01-29 13:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-01-25 17:28 [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andi Kleen
2002-01-25 18:01 ` John Levon
2002-01-25 18:08   ` Linus Torvalds
2002-01-25 18:31     ` Andreas Dilger
2002-01-25 18:46     ` Hans Reiser
2002-01-25 19:49     ` Andi Kleen
2002-01-25 20:38       ` Andreas Dilger
2002-01-25 22:15         ` Andi Kleen
2002-01-25 22:32           ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris
2002-01-25 22:59             ` Jeff Garzik
2002-01-26 10:11               ` Andrea Ferraris
2002-01-26 15:24                 ` OPS: " Andrea Ferraris
2002-01-25 22:41           ` [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andreas Dilger
2002-01-26  7:24           ` Kai Henningsen
2002-01-27 23:02           ` Alessandro Suardi
2002-01-28  0:01             ` Andi Kleen
2002-01-28 11:07               ` Jens Axboe
2002-01-28 14:53                 ` Andi Kleen
2002-01-28 14:54                   ` Jens Axboe
2002-01-29 13:14               ` Alessandro Suardi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox