* [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time
@ 2002-01-25 17:28 Andi Kleen
2002-01-25 18:01 ` John Levon
0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2002-01-25 17:28 UTC (permalink / raw)
To: torvalds; +Cc: linux-kernel, davej
[resent because t-offline's outgoing mail server seems to be eating/
not delivering mail. sorry if it appears multiple times]
This patch fixes an reiserfs BUG() at boot time introduced by the
inode cleanups. The problem is that it passes a 20 char name to
kmem_cache_create() ("reiserfs_inode_cache") but kmem_cache_create()
only allows 19 character names and BUG()s for longer names.
The patch fixes this in a low tech approach. It's required to boot
a 2.5.3preX machine with reiserfs compiled in.
-Andi
Index: fs/reiserfs/super.c
===================================================================
RCS file: /cvs/linux/fs/reiserfs/super.c,v
retrieving revision 1.17
diff -u -u -r1.17 super.c
--- fs/reiserfs/super.c 2002/01/24 14:07:54 1.17
+++ fs/reiserfs/super.c 2002/01/24 22:03:34
@@ -153,7 +153,7 @@
static int init_inodecache(void)
{
- reiserfs_inode_cachep = kmem_cache_create("reiserfs_inode_cache",
+ reiserfs_inode_cachep = kmem_cache_create("reiser_inode_cache",
sizeof(struct reiserfs_inode_info),
0, SLAB_HWCACHE_ALIGN,
init_once, NULL);
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 17:28 [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andi Kleen @ 2002-01-25 18:01 ` John Levon 2002-01-25 18:08 ` Linus Torvalds 0 siblings, 1 reply; 20+ messages in thread From: John Levon @ 2002-01-25 18:01 UTC (permalink / raw) To: Andi Kleen; +Cc: torvalds, linux-kernel, davej On Fri, Jan 25, 2002 at 06:28:08PM +0100, Andi Kleen wrote: > kmem_cache_create() ("reiserfs_inode_cache") but kmem_cache_create() > only allows 19 character names and BUG()s for longer names. please apply this too then. regards john --- mm/slab.c.old Fri Jan 25 17:54:11 2002 +++ mm/slab.c Fri Jan 25 17:55:18 2002 @@ -599,7 +599,10 @@ * @dtor: A destructor for the objects. * * Returns a ptr to the cache on success, NULL on failure. - * Cannot be called within a int, but can be interrupted. + * Cannot be called within a interrupt, but can be interrupted. + * + * strlen(@name) must be less than %CACHE_NAMELEN. + * * The @ctor is run when new pages are allocated by the cache * and the @dtor is run before the pages are handed back. * The flags are ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 18:01 ` John Levon @ 2002-01-25 18:08 ` Linus Torvalds 2002-01-25 18:31 ` Andreas Dilger ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Linus Torvalds @ 2002-01-25 18:08 UTC (permalink / raw) To: John Levon; +Cc: Andi Kleen, linux-kernel, davej On Fri, 25 Jan 2002, John Levon wrote: > > please apply this too then. I would prefer instead just avoiding the copy altogether, and just save the name pointer - with no length restrictions. Right now the code has the comment /* Copy name over so we don't have problems with unloaded modules */ but that was written before "kmem_cache_destroy()" existed, and we should long ago have fixed any modules that don't properly destroy their caches when they exit (and yes, I know the difference between "should" and "did", but that's not an excuse for a bad interface). Linus ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 18:08 ` Linus Torvalds @ 2002-01-25 18:31 ` Andreas Dilger 2002-01-25 18:46 ` Hans Reiser 2002-01-25 19:49 ` Andi Kleen 2 siblings, 0 replies; 20+ messages in thread From: Andreas Dilger @ 2002-01-25 18:31 UTC (permalink / raw) To: Linus Torvalds; +Cc: John Levon, Andi Kleen, linux-kernel, davej On Jan 25, 2002 10:08 -0800, Linus Torvalds wrote: > I would prefer instead just avoiding the copy altogether, and just save > the name pointer - with no length restrictions. > > Right now the code has the comment > > /* Copy name over so we don't have problems with unloaded modules */ Yes, I put that in. > but that was written before "kmem_cache_destroy()" existed, and we should > long ago have fixed any modules that don't properly destroy their caches > when they exit (and yes, I know the difference between "should" and "did", > but that's not an excuse for a bad interface). The problem is that if, for some reason, the cache is NOT empty when you call kmem_cache_destroy(), it will not be freed, but the module exits anyways. Then, any access to /proc/slabinfo will OOPS. Yes, code should be written correctly so that its slab is empty when it exits, but I'd rather have a _bit_ of safety here so that you can at least check slabinfo when you get a kernel message "slab is not empty" (or whatever it is) so you can at least try and investigate the problem. The other alternative is to BUG with enough information to figure out the status of this cache if you try to free a non-empty cache. At least then you would get some data at the time the real problem happens as opposed to killing some random process later that tries to read slabinfo. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 18:08 ` Linus Torvalds 2002-01-25 18:31 ` Andreas Dilger @ 2002-01-25 18:46 ` Hans Reiser 2002-01-25 19:49 ` Andi Kleen 2 siblings, 0 replies; 20+ messages in thread From: Hans Reiser @ 2002-01-25 18:46 UTC (permalink / raw) To: Linus Torvalds; +Cc: John Levon, Andi Kleen, linux-kernel, davej We are preparing a large set of patches (each to be sent as a separate email per your SOP, but all tested together in a thorough testing procedure), which includes this patch, plus the kdev fix, plus all 2.4 fixes. We expect to send it out on monday, we just found a bug in the patches as a set that was not present in them individually. Hans ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 18:08 ` Linus Torvalds 2002-01-25 18:31 ` Andreas Dilger 2002-01-25 18:46 ` Hans Reiser @ 2002-01-25 19:49 ` Andi Kleen 2002-01-25 20:38 ` Andreas Dilger 2 siblings, 1 reply; 20+ messages in thread From: Andi Kleen @ 2002-01-25 19:49 UTC (permalink / raw) To: Linus Torvalds; +Cc: John Levon, Andi Kleen, linux-kernel, davej On Fri, Jan 25, 2002 at 10:08:56AM -0800, Linus Torvalds wrote: > > On Fri, 25 Jan 2002, John Levon wrote: > > > > please apply this too then. > > I would prefer instead just avoiding the copy altogether, and just save > the name pointer - with no length restrictions. > > Right now the code has the comment > > /* Copy name over so we don't have problems with unloaded modules */ > > but that was written before "kmem_cache_destroy()" existed, and we should > long ago have fixed any modules that don't properly destroy their caches > when they exit (and yes, I know the difference between "should" and "did", > but that's not an excuse for a bad interface). This patch implements your suggestion. It works for me, but I haven't audited the whole tree if they do kmem_cache_destroy as needed. It tries to avoid oopses for unmapped or bogus names on /proc/slabinfo reading at least. Also fixes an warning in slab.c -Andi Index: mm/slab.c =================================================================== RCS file: /cvs/linux/mm/slab.c,v retrieving revision 1.66 diff -u -u -r1.66 slab.c --- mm/slab.c 2002/01/08 16:00:20 1.66 +++ mm/slab.c 2002/01/25 19:47:20 @@ -186,8 +186,6 @@ * manages a cache. */ -#define CACHE_NAMELEN 20 /* max name length for a slab cache */ - struct kmem_cache_s { /* 1) each alloc & free */ /* full, partial first, then free */ @@ -225,7 +223,7 @@ unsigned long failures; /* 3) cache creation/removal */ - char name[CACHE_NAMELEN]; + const char *name; struct list_head next; #ifdef CONFIG_SMP /* 4) per-cpu data */ @@ -335,6 +333,7 @@ kmem_cache_t *cs_dmacachep; } cache_sizes_t; +/* These are the default caches for kmalloc. Custom caches can have other sizes. */ static cache_sizes_t cache_sizes[] = { #if PAGE_SIZE == 4096 { 32, NULL, NULL}, @@ -353,6 +352,26 @@ {131072, NULL, NULL}, { 0, NULL, NULL} }; +/* Must match cache_sizes above. Out of line to keep cache footprint low. */ +#define CN(x) { x, x ## "(DMA)" } +static char cache_names[][2][18] = { +#if PAGE_SIZE == 4096 + CN("size-32"), +#endif + CN("size-64"), + CN("size-128"), + CN("size-256"), + CN("size-512"), + CN("size-1024"), + CN("size-2048"), + CN("size-4096"), + CN("size-8192"), + CN("size-16384"), + CN("size-32768"), + CN("size-65536"), + CN("size-131072") +}; +#undef CN /* internal cache of cache description objs */ static kmem_cache_t cache_cache = { @@ -437,7 +456,6 @@ void __init kmem_cache_sizes_init(void) { cache_sizes_t *sizes = cache_sizes; - char name[20]; /* * Fragmentation resistance on low memory - only use bigger * page orders on machines with more than 32MB of memory. @@ -450,9 +468,9 @@ * eliminates "false sharing". * Note for systems short on memory removing the alignment will * allow tighter packing of the smaller caches. */ - sprintf(name,"size-%Zd",sizes->cs_size); if (!(sizes->cs_cachep = - kmem_cache_create(name, sizes->cs_size, + kmem_cache_create(cache_names[sizes-cache_sizes][0], + sizes->cs_size, 0, SLAB_HWCACHE_ALIGN, NULL, NULL))) { BUG(); } @@ -462,9 +480,10 @@ offslab_limit = sizes->cs_size-sizeof(slab_t); offslab_limit /= 2; } - sprintf(name, "size-%Zd(DMA)",sizes->cs_size); - sizes->cs_dmacachep = kmem_cache_create(name, sizes->cs_size, 0, - SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL); + sizes->cs_dmacachep = kmem_cache_create( + cache_names[sizes-cache_sizes][1], + sizes->cs_size, 0, + SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL); if (!sizes->cs_dmacachep) BUG(); sizes++; @@ -604,6 +623,11 @@ * Cannot be called within a int, but can be interrupted. * The @ctor is run when new pages are allocated by the cache * and the @dtor is run before the pages are handed back. + * + * @name must be valid until the cache is destroyed. This implies that + * the module calling this has to destroy the cache before getting + * unloaded. + * * The flags are * * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5) @@ -632,7 +656,6 @@ * Sanity checks... these are all serious usage bugs. */ if ((!name) || - ((strlen(name) >= CACHE_NAMELEN - 1)) || in_interrupt() || (size < BYTES_PER_WORD) || (size > (1<<MAX_OBJ_ORDER)*PAGE_SIZE) || @@ -797,8 +820,7 @@ cachep->slabp_cache = kmem_find_general_cachep(slab_size,0); cachep->ctor = ctor; cachep->dtor = dtor; - /* Copy name over so we don't have problems with unloaded modules */ - strcpy(cachep->name, name); + cachep->name = name; #ifdef CONFIG_SMP if (g_cpucache_up) @@ -810,11 +832,8 @@ struct list_head *p; list_for_each(p, &cache_chain) { - kmem_cache_t *pc = list_entry(p, kmem_cache_t, next); - - /* The name field is constant - no lock needed. */ - if (!strcmp(pc->name, name)) - BUG(); + kmem_cache_t *pc; + pc = list_entry(p, kmem_cache_t, next); } } @@ -1878,6 +1897,7 @@ unsigned long num_objs; unsigned long active_slabs = 0; unsigned long num_slabs; + const char *name; cachep = list_entry(p, kmem_cache_t, next); spin_lock_irq(&cachep->spinlock); @@ -1906,8 +1926,15 @@ num_slabs+=active_slabs; num_objs = num_slabs*cachep->num; + name = cachep->name; + { + char tmp; + if (get_user(tmp, name)) + name = "broken"; + } + len += sprintf(page+len, "%-17s %6lu %6lu %6u %4lu %4lu %4u", - cachep->name, active_objs, num_objs, cachep->objsize, + name, active_objs, num_objs, cachep->objsize, active_slabs, num_slabs, (1<<cachep->gfporder)); #if STATS ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 19:49 ` Andi Kleen @ 2002-01-25 20:38 ` Andreas Dilger 2002-01-25 22:15 ` Andi Kleen 0 siblings, 1 reply; 20+ messages in thread From: Andreas Dilger @ 2002-01-25 20:38 UTC (permalink / raw) To: Andi Kleen; +Cc: Linus Torvalds, John Levon, linux-kernel, davej On Jan 25, 2002 20:49 +0100, Andi Kleen wrote: > @@ -810,11 +832,8 @@ > struct list_head *p; > > list_for_each(p, &cache_chain) { > - kmem_cache_t *pc = list_entry(p, kmem_cache_t, next); > - > - /* The name field is constant - no lock needed. */ > - if (!strcmp(pc->name, name)) > - BUG(); > + kmem_cache_t *pc; > + pc = list_entry(p, kmem_cache_t, next); > } > } > So, what exactly does the above do now (hint: p and pc are both local so they cannot be referenced anywhere else)? It used to check that you weren't trying to add two caches with the same name. This isn't possible with caches from broken modules anymore as they have no name. In the end, it is mostly irrelevant if we have duplicate names in the slab cache, because you can't "attach" to a cache by name (you can only "create" a cache and access it via a pointer). We may as well just remove the whole loop above, since it doesn't do anything anymore. > + name = cachep->name; > + { > + char tmp; > + if (get_user(tmp, name)) > + name = "broken"; > + } When calling kmem_cache_destroy() on a non-empty slab we should just malloc some memory with the old cache name + "_leaked" for the name pointer. At least then we have a sane chance of figuring out what caused the problem, instead of having a bunch of "broken" entries in the table, and remove the above "broken" check entirely (we will always have a name). Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 20:38 ` Andreas Dilger @ 2002-01-25 22:15 ` Andi Kleen 2002-01-25 22:32 ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris ` (3 more replies) 0 siblings, 4 replies; 20+ messages in thread From: Andi Kleen @ 2002-01-25 22:15 UTC (permalink / raw) To: Andi Kleen, Linus Torvalds, John Levon, linux-kernel, davej On Fri, Jan 25, 2002 at 01:38:14PM -0700, Andreas Dilger wrote: > So, what exactly does the above do now (hint: p and pc are both local > so they cannot be referenced anywhere else)? It used to check that you > weren't trying to add two caches with the same name. This isn't > possible with caches from broken modules anymore as they have no name. I have fixed the loop now to check for names again. > When calling kmem_cache_destroy() on a non-empty slab we should just > malloc some memory with the old cache name + "_leaked" for the name > pointer. At least then we have a sane chance of figuring out what caused > the problem, instead of having a bunch of "broken" entries in the table, > and remove the above "broken" check entirely (we will always have a name). I don't like this because it complicates the code too much. "broken" should be enough to debug it. New patch appended. Linus please apply if you didn't already. -Andi Index: mm/slab.c =================================================================== RCS file: /cvs/linux/mm/slab.c,v retrieving revision 1.66 diff -u -u -r1.66 slab.c --- mm/slab.c 2002/01/08 16:00:20 1.66 +++ mm/slab.c 2002/01/25 22:14:40 @@ -186,8 +186,6 @@ * manages a cache. */ -#define CACHE_NAMELEN 20 /* max name length for a slab cache */ - struct kmem_cache_s { /* 1) each alloc & free */ /* full, partial first, then free */ @@ -225,7 +223,7 @@ unsigned long failures; /* 3) cache creation/removal */ - char name[CACHE_NAMELEN]; + const char *name; struct list_head next; #ifdef CONFIG_SMP /* 4) per-cpu data */ @@ -335,6 +333,7 @@ kmem_cache_t *cs_dmacachep; } cache_sizes_t; +/* These are the default caches for kmalloc. Custom caches can have other sizes. */ static cache_sizes_t cache_sizes[] = { #if PAGE_SIZE == 4096 { 32, NULL, NULL}, @@ -353,6 +352,26 @@ {131072, NULL, NULL}, { 0, NULL, NULL} }; +/* Must match cache_sizes above. Out of line to keep cache footprint low. */ +#define CN(x) { x, x ## "(DMA)" } +static char cache_names[][2][18] = { +#if PAGE_SIZE == 4096 + CN("size-32"), +#endif + CN("size-64"), + CN("size-128"), + CN("size-256"), + CN("size-512"), + CN("size-1024"), + CN("size-2048"), + CN("size-4096"), + CN("size-8192"), + CN("size-16384"), + CN("size-32768"), + CN("size-65536"), + CN("size-131072") +}; +#undef CN /* internal cache of cache description objs */ static kmem_cache_t cache_cache = { @@ -437,7 +456,6 @@ void __init kmem_cache_sizes_init(void) { cache_sizes_t *sizes = cache_sizes; - char name[20]; /* * Fragmentation resistance on low memory - only use bigger * page orders on machines with more than 32MB of memory. @@ -450,9 +468,9 @@ * eliminates "false sharing". * Note for systems short on memory removing the alignment will * allow tighter packing of the smaller caches. */ - sprintf(name,"size-%Zd",sizes->cs_size); if (!(sizes->cs_cachep = - kmem_cache_create(name, sizes->cs_size, + kmem_cache_create(cache_names[sizes-cache_sizes][0], + sizes->cs_size, 0, SLAB_HWCACHE_ALIGN, NULL, NULL))) { BUG(); } @@ -462,9 +480,10 @@ offslab_limit = sizes->cs_size-sizeof(slab_t); offslab_limit /= 2; } - sprintf(name, "size-%Zd(DMA)",sizes->cs_size); - sizes->cs_dmacachep = kmem_cache_create(name, sizes->cs_size, 0, - SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL); + sizes->cs_dmacachep = kmem_cache_create( + cache_names[sizes-cache_sizes][1], + sizes->cs_size, 0, + SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL); if (!sizes->cs_dmacachep) BUG(); sizes++; @@ -604,6 +623,11 @@ * Cannot be called within a int, but can be interrupted. * The @ctor is run when new pages are allocated by the cache * and the @dtor is run before the pages are handed back. + * + * @name must be valid until the cache is destroyed. This implies that + * the module calling this has to destroy the cache before getting + * unloaded. + * * The flags are * * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5) @@ -632,7 +656,6 @@ * Sanity checks... these are all serious usage bugs. */ if ((!name) || - ((strlen(name) >= CACHE_NAMELEN - 1)) || in_interrupt() || (size < BYTES_PER_WORD) || (size > (1<<MAX_OBJ_ORDER)*PAGE_SIZE) || @@ -797,8 +820,7 @@ cachep->slabp_cache = kmem_find_general_cachep(slab_size,0); cachep->ctor = ctor; cachep->dtor = dtor; - /* Copy name over so we don't have problems with unloaded modules */ - strcpy(cachep->name, name); + cachep->name = name; #ifdef CONFIG_SMP if (g_cpucache_up) @@ -811,10 +833,11 @@ list_for_each(p, &cache_chain) { kmem_cache_t *pc = list_entry(p, kmem_cache_t, next); - - /* The name field is constant - no lock needed. */ - if (!strcmp(pc->name, name)) - BUG(); + char tmp; + if (get_user(tmp,pc->name)) + continue; + if (!strcmp(pc->name,name)) + BUG(); } } @@ -1878,6 +1901,7 @@ unsigned long num_objs; unsigned long active_slabs = 0; unsigned long num_slabs; + const char *name; cachep = list_entry(p, kmem_cache_t, next); spin_lock_irq(&cachep->spinlock); @@ -1906,8 +1930,15 @@ num_slabs+=active_slabs; num_objs = num_slabs*cachep->num; + name = cachep->name; + { + char tmp; + if (get_user(tmp, name)) + name = "broken"; + } + len += sprintf(page+len, "%-17s %6lu %6lu %6u %4lu %4lu %4u", - cachep->name, active_objs, num_objs, cachep->objsize, + name, active_objs, num_objs, cachep->objsize, active_slabs, num_slabs, (1<<cachep->gfporder)); #if STATS ^ permalink raw reply [flat|nested] 20+ messages in thread
* eth0: NULL pointer encountered in RX ring, skipping 2002-01-25 22:15 ` Andi Kleen @ 2002-01-25 22:32 ` Andrea Ferraris 2002-01-25 22:59 ` Jeff Garzik 2002-01-25 22:41 ` [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andreas Dilger ` (2 subsequent siblings) 3 siblings, 1 reply; 20+ messages in thread From: Andrea Ferraris @ 2002-01-25 22:32 UTC (permalink / raw) To: linux-kernel After such kernel message the network interface (kernel 2.4.2) replied to ping from other network's machines with times varying between 200 and 2000 ms and it was almost impossible to use the network to transferring files to or from this PC. The card is a SIS960 on the mboard on an Acer PC with a Celeron 666 Mhz processor. The problem was solved with a shutdown -h, powering down and powering up the PC. Can somebody explain the trouble? Best regards to all all and thx to replying people, Andrea ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: eth0: NULL pointer encountered in RX ring, skipping 2002-01-25 22:32 ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris @ 2002-01-25 22:59 ` Jeff Garzik 2002-01-26 10:11 ` Andrea Ferraris 0 siblings, 1 reply; 20+ messages in thread From: Jeff Garzik @ 2002-01-25 22:59 UTC (permalink / raw) To: andrea_ferraris; +Cc: linux-kernel Well, the code says "this should never happen" ;-) But anyway, it is probably a temporary memory allocation failure. The code handles this case. Jeff -- Jeff Garzik | "I went through my candy like hot oatmeal Building 1024 | through an internally-buttered weasel." MandrakeSoft | - goats.com ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: eth0: NULL pointer encountered in RX ring, skipping 2002-01-25 22:59 ` Jeff Garzik @ 2002-01-26 10:11 ` Andrea Ferraris 2002-01-26 15:24 ` OPS: " Andrea Ferraris 0 siblings, 1 reply; 20+ messages in thread From: Andrea Ferraris @ 2002-01-26 10:11 UTC (permalink / raw) To: Jeff Garzik; +Cc: linux-kernel Friday 25 January 2002 23:59, Jeff Garzik scrisse: > Well, the code says "this should never happen" ;-) > > But anyway, it is probably a temporary memory allocation failure. The > code handles this case. Yes, but I think that isn't normal to have to do a cold reboot to have the machine again working on the network. It is, maybe that the code doesn't handle so well this case. Do you suggest a kernel upgrade? Andrea ^ permalink raw reply [flat|nested] 20+ messages in thread
* OPS: eth0: NULL pointer encountered in RX ring, skipping 2002-01-26 10:11 ` Andrea Ferraris @ 2002-01-26 15:24 ` Andrea Ferraris 0 siblings, 0 replies; 20+ messages in thread From: Andrea Ferraris @ 2002-01-26 15:24 UTC (permalink / raw) To: Jeff Garzik; +Cc: linux-kernel Saturday 26 January 2002 11:11, Andrea Ferraris scrisse: > Friday 25 January 2002 23:59, Jeff Garzik scrisse: > > Well, the code says "this should never happen" ;-) > > > > But anyway, it is probably a temporary memory allocation failure. The > > code handles this case. > > Yes, but I think that isn't normal to have to do a cold reboot to have the > machine again working on the network. It is, maybe that the code doesn't > handle so well this case. Do you suggest a kernel upgrade? Ops ... sorry. It was a 2.2.16 kernel. Andrea ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 22:15 ` Andi Kleen 2002-01-25 22:32 ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris @ 2002-01-25 22:41 ` Andreas Dilger 2002-01-26 7:24 ` Kai Henningsen 2002-01-27 23:02 ` Alessandro Suardi 3 siblings, 0 replies; 20+ messages in thread From: Andreas Dilger @ 2002-01-25 22:41 UTC (permalink / raw) To: Andi Kleen; +Cc: Linus Torvalds, John Levon, linux-kernel, davej On Jan 25, 2002 23:15 +0100, Andi Kleen wrote: > On Fri, Jan 25, 2002 at 01:38:14PM -0700, Andreas Dilger wrote: > > When calling kmem_cache_destroy() on a non-empty slab we should just > > malloc some memory with the old cache name + "_leaked" for the name > > pointer. At least then we have a sane chance of figuring out what caused > > the problem, instead of having a bunch of "broken" entries in the table, > > and remove the above "broken" check entirely (we will always have a name). > > I don't like this because it complicates the code too much. > "broken" should be enough to debug it. Hmm, then you could just point to a static "broken" name at kmem_cache_destroy() time and save yourself the get_user() checks for each access to the name. This would gratuitously overwrite the name for non-modular caches that failed to unload, but I doubt that such things exist. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 22:15 ` Andi Kleen 2002-01-25 22:32 ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris 2002-01-25 22:41 ` [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andreas Dilger @ 2002-01-26 7:24 ` Kai Henningsen 2002-01-27 23:02 ` Alessandro Suardi 3 siblings, 0 replies; 20+ messages in thread From: Kai Henningsen @ 2002-01-26 7:24 UTC (permalink / raw) To: ak; +Cc: linux-kernel ak@suse.de (Andi Kleen) wrote on 25.01.02 in <20020125231555.A22583@wotan.suse.de>: > +/* Must match cache_sizes above. Out of line to keep cache footprint low. > */ +#define CN(x) { x, x ## "(DMA)" } > +static char cache_names[][2][18] = { > + CN("size-128"), > + CN("size-256"), > + CN("size-512"), What on earth is that ## for?! If that actually works, I strongly suspect that it is a bug in cpp. MfG Kai ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-25 22:15 ` Andi Kleen ` (2 preceding siblings ...) 2002-01-26 7:24 ` Kai Henningsen @ 2002-01-27 23:02 ` Alessandro Suardi 2002-01-28 0:01 ` Andi Kleen 3 siblings, 1 reply; 20+ messages in thread From: Alessandro Suardi @ 2002-01-27 23:02 UTC (permalink / raw) To: Andi Kleen; +Cc: Linus Torvalds, John Levon, linux-kernel, davej Andi Kleen wrote: > > On Fri, Jan 25, 2002 at 01:38:14PM -0700, Andreas Dilger wrote: > > So, what exactly does the above do now (hint: p and pc are both local > > so they cannot be referenced anywhere else)? It used to check that you > > weren't trying to add two caches with the same name. This isn't > > possible with caches from broken modules anymore as they have no name. > > I have fixed the loop now to check for names again. > > > When calling kmem_cache_destroy() on a non-empty slab we should just > > malloc some memory with the old cache name + "_leaked" for the name > > pointer. At least then we have a sane chance of figuring out what caused > > the problem, instead of having a bunch of "broken" entries in the table, > > and remove the above "broken" check entirely (we will always have a name). > > I don't like this because it complicates the code too much. > "broken" should be enough to debug it. > > New patch appended. Linus please apply if you didn't already. > > -Andi 2.5.3-pre5 + this patch still can't boot my system. I haven't had time to copy down oops at boot, will do if needed. Thanks & ciao, > Index: mm/slab.c > =================================================================== > RCS file: /cvs/linux/mm/slab.c,v > retrieving revision 1.66 > diff -u -u -r1.66 slab.c > --- mm/slab.c 2002/01/08 16:00:20 1.66 > +++ mm/slab.c 2002/01/25 22:14:40 > @@ -186,8 +186,6 @@ > * manages a cache. > */ > > -#define CACHE_NAMELEN 20 /* max name length for a slab cache */ > - > struct kmem_cache_s { > /* 1) each alloc & free */ > /* full, partial first, then free */ > @@ -225,7 +223,7 @@ > unsigned long failures; > > /* 3) cache creation/removal */ > - char name[CACHE_NAMELEN]; > + const char *name; > struct list_head next; > #ifdef CONFIG_SMP > /* 4) per-cpu data */ > @@ -335,6 +333,7 @@ > kmem_cache_t *cs_dmacachep; > } cache_sizes_t; > > +/* These are the default caches for kmalloc. Custom caches can have other sizes. */ > static cache_sizes_t cache_sizes[] = { > #if PAGE_SIZE == 4096 > { 32, NULL, NULL}, > @@ -353,6 +352,26 @@ > {131072, NULL, NULL}, > { 0, NULL, NULL} > }; > +/* Must match cache_sizes above. Out of line to keep cache footprint low. */ > +#define CN(x) { x, x ## "(DMA)" } > +static char cache_names[][2][18] = { > +#if PAGE_SIZE == 4096 > + CN("size-32"), > +#endif > + CN("size-64"), > + CN("size-128"), > + CN("size-256"), > + CN("size-512"), > + CN("size-1024"), > + CN("size-2048"), > + CN("size-4096"), > + CN("size-8192"), > + CN("size-16384"), > + CN("size-32768"), > + CN("size-65536"), > + CN("size-131072") > +}; > +#undef CN > > /* internal cache of cache description objs */ > static kmem_cache_t cache_cache = { > @@ -437,7 +456,6 @@ > void __init kmem_cache_sizes_init(void) > { > cache_sizes_t *sizes = cache_sizes; > - char name[20]; > /* > * Fragmentation resistance on low memory - only use bigger > * page orders on machines with more than 32MB of memory. > @@ -450,9 +468,9 @@ > * eliminates "false sharing". > * Note for systems short on memory removing the alignment will > * allow tighter packing of the smaller caches. */ > - sprintf(name,"size-%Zd",sizes->cs_size); > if (!(sizes->cs_cachep = > - kmem_cache_create(name, sizes->cs_size, > + kmem_cache_create(cache_names[sizes-cache_sizes][0], > + sizes->cs_size, > 0, SLAB_HWCACHE_ALIGN, NULL, NULL))) { > BUG(); > } > @@ -462,9 +480,10 @@ > offslab_limit = sizes->cs_size-sizeof(slab_t); > offslab_limit /= 2; > } > - sprintf(name, "size-%Zd(DMA)",sizes->cs_size); > - sizes->cs_dmacachep = kmem_cache_create(name, sizes->cs_size, 0, > - SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL); > + sizes->cs_dmacachep = kmem_cache_create( > + cache_names[sizes-cache_sizes][1], > + sizes->cs_size, 0, > + SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL); > if (!sizes->cs_dmacachep) > BUG(); > sizes++; > @@ -604,6 +623,11 @@ > * Cannot be called within a int, but can be interrupted. > * The @ctor is run when new pages are allocated by the cache > * and the @dtor is run before the pages are handed back. > + * > + * @name must be valid until the cache is destroyed. This implies that > + * the module calling this has to destroy the cache before getting > + * unloaded. > + * > * The flags are > * > * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5) > @@ -632,7 +656,6 @@ > * Sanity checks... these are all serious usage bugs. > */ > if ((!name) || > - ((strlen(name) >= CACHE_NAMELEN - 1)) || > in_interrupt() || > (size < BYTES_PER_WORD) || > (size > (1<<MAX_OBJ_ORDER)*PAGE_SIZE) || > @@ -797,8 +820,7 @@ > cachep->slabp_cache = kmem_find_general_cachep(slab_size,0); > cachep->ctor = ctor; > cachep->dtor = dtor; > - /* Copy name over so we don't have problems with unloaded modules */ > - strcpy(cachep->name, name); > + cachep->name = name; > > #ifdef CONFIG_SMP > if (g_cpucache_up) > @@ -811,10 +833,11 @@ > > list_for_each(p, &cache_chain) { > kmem_cache_t *pc = list_entry(p, kmem_cache_t, next); > - > - /* The name field is constant - no lock needed. */ > - if (!strcmp(pc->name, name)) > - BUG(); > + char tmp; > + if (get_user(tmp,pc->name)) > + continue; > + if (!strcmp(pc->name,name)) > + BUG(); > } > } > > @@ -1878,6 +1901,7 @@ > unsigned long num_objs; > unsigned long active_slabs = 0; > unsigned long num_slabs; > + const char *name; > cachep = list_entry(p, kmem_cache_t, next); > > spin_lock_irq(&cachep->spinlock); > @@ -1906,8 +1930,15 @@ > num_slabs+=active_slabs; > num_objs = num_slabs*cachep->num; > > + name = cachep->name; > + { > + char tmp; > + if (get_user(tmp, name)) > + name = "broken"; > + } > + > len += sprintf(page+len, "%-17s %6lu %6lu %6u %4lu %4lu %4u", > - cachep->name, active_objs, num_objs, cachep->objsize, > + name, active_objs, num_objs, cachep->objsize, > active_slabs, num_slabs, (1<<cachep->gfporder)); > > #if STATS --alessandro "this machine will, will not communicate these thoughts and the strain I am under be a world child, form a circle before we all go under" (Radiohead, "Street Spirit [fade out]") ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-27 23:02 ` Alessandro Suardi @ 2002-01-28 0:01 ` Andi Kleen 2002-01-28 11:07 ` Jens Axboe 2002-01-29 13:14 ` Alessandro Suardi 0 siblings, 2 replies; 20+ messages in thread From: Andi Kleen @ 2002-01-28 0:01 UTC (permalink / raw) To: Alessandro Suardi Cc: Andi Kleen, Linus Torvalds, John Levon, linux-kernel, davej On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote: > > 2.5.3-pre5 + this patch still can't boot my system. I haven't had > time to copy down oops at boot, will do if needed. Please do. I cannot see anything in the patch that should prevent bootup though, so I would also recommend a make clean and recompile first just to make sure it isn't a broken build. -Andi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-28 0:01 ` Andi Kleen @ 2002-01-28 11:07 ` Jens Axboe 2002-01-28 14:53 ` Andi Kleen 2002-01-29 13:14 ` Alessandro Suardi 1 sibling, 1 reply; 20+ messages in thread From: Jens Axboe @ 2002-01-28 11:07 UTC (permalink / raw) To: Andi Kleen Cc: Alessandro Suardi, Linus Torvalds, John Levon, linux-kernel, davej On Mon, Jan 28 2002, Andi Kleen wrote: > On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote: > > > > 2.5.3-pre5 + this patch still can't boot my system. I haven't had > > time to copy down oops at boot, will do if needed. > > Please do. I cannot see anything in the patch that should prevent bootup > though, so I would also recommend a make clean and recompile first just > to make sure it isn't a broken build. Probably the kmem_cache_create 'name too long' bug that Viro pointed out to me. fs/reiserfs/super.c:init_inodecache(). Change the name passed to kmem_cache_create to something shorter. -- Jens Axboe ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-28 11:07 ` Jens Axboe @ 2002-01-28 14:53 ` Andi Kleen 2002-01-28 14:54 ` Jens Axboe 0 siblings, 1 reply; 20+ messages in thread From: Andi Kleen @ 2002-01-28 14:53 UTC (permalink / raw) To: Jens Axboe Cc: Andi Kleen, Alessandro Suardi, Linus Torvalds, John Levon, linux-kernel, davej On Mon, Jan 28, 2002 at 12:07:47PM +0100, Jens Axboe wrote: > On Mon, Jan 28 2002, Andi Kleen wrote: > > On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote: > > > > > > 2.5.3-pre5 + this patch still can't boot my system. I haven't had > > > time to copy down oops at boot, will do if needed. > > > > Please do. I cannot see anything in the patch that should prevent bootup > > though, so I would also recommend a make clean and recompile first just > > to make sure it isn't a broken build. > > Probably the kmem_cache_create 'name too long' bug that Viro pointed out > to me. fs/reiserfs/super.c:init_inodecache(). Change the name passed to > kmem_cache_create to something shorter. The patch he tried removed the check of the name length ;) -Andi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-28 14:53 ` Andi Kleen @ 2002-01-28 14:54 ` Jens Axboe 0 siblings, 0 replies; 20+ messages in thread From: Jens Axboe @ 2002-01-28 14:54 UTC (permalink / raw) To: Andi Kleen Cc: Alessandro Suardi, Linus Torvalds, John Levon, linux-kernel, davej On Mon, Jan 28 2002, Andi Kleen wrote: > On Mon, Jan 28, 2002 at 12:07:47PM +0100, Jens Axboe wrote: > > On Mon, Jan 28 2002, Andi Kleen wrote: > > > On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote: > > > > > > > > 2.5.3-pre5 + this patch still can't boot my system. I haven't had > > > > time to copy down oops at boot, will do if needed. > > > > > > Please do. I cannot see anything in the patch that should prevent bootup > > > though, so I would also recommend a make clean and recompile first just > > > to make sure it isn't a broken build. > > > > Probably the kmem_cache_create 'name too long' bug that Viro pointed out > > to me. fs/reiserfs/super.c:init_inodecache(). Change the name passed to > > kmem_cache_create to something shorter. > > The patch he tried removed the check of the name length ;) Heh, my cover is blown -- I didn't read it :-) -- Jens Axboe ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time 2002-01-28 0:01 ` Andi Kleen 2002-01-28 11:07 ` Jens Axboe @ 2002-01-29 13:14 ` Alessandro Suardi 1 sibling, 0 replies; 20+ messages in thread From: Alessandro Suardi @ 2002-01-29 13:14 UTC (permalink / raw) To: Andi Kleen; +Cc: Linus Torvalds, John Levon, linux-kernel, davej Andi Kleen wrote: > > On Mon, Jan 28, 2002 at 12:02:54AM +0100, Alessandro Suardi wrote: > > > > 2.5.3-pre5 + this patch still can't boot my system. I haven't had > > time to copy down oops at boot, will do if needed. > > Please do. I cannot see anything in the patch that should prevent bootup > though, so I would also recommend a make clean and recompile first just > to make sure it isn't a broken build. I ended up away from email for a couple of days and saw -pre6; re-patched from 2.5.2 to -pre6, boot is okay . Thanks & ciao, --alessandro "this machine will, will not communicate these thoughts and the strain I am under be a world child, form a circle before we all go under" (Radiohead, "Street Spirit [fade out]") ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2002-01-29 13:13 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2002-01-25 17:28 [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andi Kleen 2002-01-25 18:01 ` John Levon 2002-01-25 18:08 ` Linus Torvalds 2002-01-25 18:31 ` Andreas Dilger 2002-01-25 18:46 ` Hans Reiser 2002-01-25 19:49 ` Andi Kleen 2002-01-25 20:38 ` Andreas Dilger 2002-01-25 22:15 ` Andi Kleen 2002-01-25 22:32 ` eth0: NULL pointer encountered in RX ring, skipping Andrea Ferraris 2002-01-25 22:59 ` Jeff Garzik 2002-01-26 10:11 ` Andrea Ferraris 2002-01-26 15:24 ` OPS: " Andrea Ferraris 2002-01-25 22:41 ` [PATCH] Fix 2.5.3pre reiserfs BUG() at boot time Andreas Dilger 2002-01-26 7:24 ` Kai Henningsen 2002-01-27 23:02 ` Alessandro Suardi 2002-01-28 0:01 ` Andi Kleen 2002-01-28 11:07 ` Jens Axboe 2002-01-28 14:53 ` Andi Kleen 2002-01-28 14:54 ` Jens Axboe 2002-01-29 13:14 ` Alessandro Suardi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox