* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() [not found] ` <20070723190152.GA5755@martell.zuzino.mipt.ru> @ 2007-07-23 20:24 ` Andrew Morton 2007-07-23 20:40 ` Alexey Dobriyan 2007-07-24 10:01 ` Mike Galbraith 0 siblings, 2 replies; 31+ messages in thread From: Andrew Morton @ 2007-07-23 20:24 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: Linus Torvalds, linux-kernel, netdev On Mon, 23 Jul 2007 23:01:52 +0400 Alexey Dobriyan <adobriyan@gmail.com> wrote: > On Mon, Jul 23, 2007 at 10:38:39PM +0400, Alexey Dobriyan wrote: > > Managed to hit BUG_ON() in kmap_atomic_prot() three times while doing > > nothing unusual for this box (two times it was under X, so I can't > > guarantee, one time while trying to reproduce via ./configure in gdb > > tarball) Yeah, I hit this several times a few days ago. Same story: it just randomly went splat in response to no obvious stimulus. Reported it to netdev, was greeted with stunned silence. > > Box has 2.5G of RAM. 2.6.22 was OK. > > > > [dives into framebuffer console setup for complete oops] > > kernel BUG at arch/i386/mm/highmem.c:38 > PREEMPT DEBUG_PAGEALLOC SLAB > EIP at kmap_atomic_prot+0x32/0x93 > get_page_from_freelist > __alloc_pages > cache_alloc_refill > cache_alloc_refill > kmem_cache_alloc > dst_alloc > dst_alloc > __ip_route_output_key > [some junk I don't trust] > > eax: 0000000c > ebx: 00000003 > ecx: c065efe0 > edx: 00000003 > edi: 00000163 > > > c010cc9b <kmap_atomic_prot>: > c010cc9b: 57 push %edi > c010cc9c: 56 push %esi > c010cc9d: 53 push %ebx > c010cc9e: 89 c6 mov %eax,%esi > c010cca0: 89 d3 mov %edx,%ebx > c010cca2: 89 cf mov %ecx,%edi > c010cca4: b8 01 00 00 00 mov $0x1,%eax > c010cca9: e8 dd 1b 00 00 call c010e88b <add_preempt_count> > c010ccae: e8 b1 ac 0e 00 call c01f7964 <debug_smp_processor_id> > c010ccb3: 6b c0 0d imul $0xd,%eax,%eax > c010ccb6: 8d 14 03 lea (%ebx,%eax,1),%edx > c010ccb9: 8d 04 95 00 00 00 00 lea 0x0(,%edx,4),%eax > c010ccc0: 8b 0d 30 a1 3e c0 mov 0xc03ea130,%ecx > c010ccc6: 29 c1 sub %eax,%ecx > c010ccc8: 83 39 00 cmpl $0x0,(%ecx) > c010cccb: 74 04 je c010ccd1 <kmap_atomic_prot+0x36> > c010cccd: 0f 0b ud2a I had more complete info: http://article.gmane.org/gmane.linux.network/66966 You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. I haven't worked out where that kmap_atomic() call is coming from yet. Both traces point up into the page allocator, but I _think_ that's stack gunk. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 20:24 ` 2.6.23-rc1: BUG_ON in kmap_atomic_prot() Andrew Morton @ 2007-07-23 20:40 ` Alexey Dobriyan 2007-07-23 21:01 ` Alexey Dobriyan 2007-07-24 10:01 ` Mike Galbraith 1 sibling, 1 reply; 31+ messages in thread From: Alexey Dobriyan @ 2007-07-23 20:40 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, netdev On Mon, Jul 23, 2007 at 01:24:31PM -0700, Andrew Morton wrote: > On Mon, 23 Jul 2007 23:01:52 +0400 > Alexey Dobriyan <adobriyan@gmail.com> wrote: > > > On Mon, Jul 23, 2007 at 10:38:39PM +0400, Alexey Dobriyan wrote: > > > Managed to hit BUG_ON() in kmap_atomic_prot() three times while doing > > > nothing unusual for this box (two times it was under X, so I can't > > > guarantee, one time while trying to reproduce via ./configure in gdb > > > tarball) > > Yeah, I hit this several times a few days ago. Same story: it just > randomly went splat in response to no obvious stimulus. Reported it to > netdev, was greeted with stunned silence. > > > > > Box has 2.5G of RAM. 2.6.22 was OK. > > > > > > [dives into framebuffer console setup for complete oops] > > > > kernel BUG at arch/i386/mm/highmem.c:38 > > PREEMPT DEBUG_PAGEALLOC SLAB > > EIP at kmap_atomic_prot+0x32/0x93 > > get_page_from_freelist > > __alloc_pages > > cache_alloc_refill > > cache_alloc_refill > > kmem_cache_alloc > > dst_alloc > > dst_alloc > > __ip_route_output_key > > [some junk I don't trust] > > > > eax: 0000000c > > ebx: 00000003 > > ecx: c065efe0 > > edx: 00000003 > > edi: 00000163 > > > > > > c010cc9b <kmap_atomic_prot>: > > c010cc9b: 57 push %edi > > c010cc9c: 56 push %esi > > c010cc9d: 53 push %ebx > > c010cc9e: 89 c6 mov %eax,%esi > > c010cca0: 89 d3 mov %edx,%ebx > > c010cca2: 89 cf mov %ecx,%edi > > c010cca4: b8 01 00 00 00 mov $0x1,%eax > > c010cca9: e8 dd 1b 00 00 call c010e88b <add_preempt_count> > > c010ccae: e8 b1 ac 0e 00 call c01f7964 <debug_smp_processor_id> > > c010ccb3: 6b c0 0d imul $0xd,%eax,%eax > > c010ccb6: 8d 14 03 lea (%ebx,%eax,1),%edx > > c010ccb9: 8d 04 95 00 00 00 00 lea 0x0(,%edx,4),%eax > > c010ccc0: 8b 0d 30 a1 3e c0 mov 0xc03ea130,%ecx > > c010ccc6: 29 c1 sub %eax,%ecx > > c010ccc8: 83 39 00 cmpl $0x0,(%ecx) > > c010cccb: 74 04 je c010ccd1 <kmap_atomic_prot+0x36> > > c010cccd: 0f 0b ud2a > > I had more complete info: http://article.gmane.org/gmane.linux.network/66966 > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. > > I haven't worked out where that kmap_atomic() call is coming from yet. > Both traces point up into the page allocator, but I _think_ that's stack > gunk. Ahh, you suspect networking. Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network receiving ~20 junk packets per second, one gathering netconsole output and ssh to it, no conntracks and fancy stuff. [reboots with cables physically unplugged] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 20:40 ` Alexey Dobriyan @ 2007-07-23 21:01 ` Alexey Dobriyan 2007-07-23 21:11 ` Andrew Morton 0 siblings, 1 reply; 31+ messages in thread From: Alexey Dobriyan @ 2007-07-23 21:01 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, netdev On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote: > > I had more complete info: http://article.gmane.org/gmane.linux.network/66966 > > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. > > > > I haven't worked out where that kmap_atomic() call is coming from yet. > > Both traces point up into the page allocator, but I _think_ that's stack > > gunk. > > Ahh, you suspect networking. > > Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network > receiving ~20 junk packets per second, one gathering netconsole output > and ssh to it, no conntracks and fancy stuff. > > [reboots with cables physically unplugged] OK, I run gdb recompile, cat(1) every file in /usr/portage (shitload of small files) with both cables unplugged. It all went fine for ~5 minutes after that it crashed exactly same way after 10 secs after plugging one of them. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 21:01 ` Alexey Dobriyan @ 2007-07-23 21:11 ` Andrew Morton 2007-07-23 21:28 ` Linus Torvalds 2007-07-23 22:04 ` 2.6.23-rc1: BUG_ON in kmap_atomic_prot() Alexey Dobriyan 0 siblings, 2 replies; 31+ messages in thread From: Andrew Morton @ 2007-07-23 21:11 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: Linus Torvalds, linux-kernel, netdev On Tue, 24 Jul 2007 01:01:53 +0400 Alexey Dobriyan <adobriyan@gmail.com> wrote: > On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote: > > > I had more complete info: http://article.gmane.org/gmane.linux.network/66966 > > > > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. > > > > > > I haven't worked out where that kmap_atomic() call is coming from yet. > > > Both traces point up into the page allocator, but I _think_ that's stack > > > gunk. > > > > Ahh, you suspect networking. > > > > Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network > > receiving ~20 junk packets per second, one gathering netconsole output > > and ssh to it, no conntracks and fancy stuff. > > > > [reboots with cables physically unplugged] > > OK, I run gdb recompile, cat(1) every file in /usr/portage (shitload of > small files) with both cables unplugged. It all went fine for ~5 minutes > after that it crashed exactly same way after 10 secs after plugging one > of them. It'd be nice to get a clean trace. Are you able to obtain the full trace with CONFIG_FRAME_POINTER=y? ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 21:11 ` Andrew Morton @ 2007-07-23 21:28 ` Linus Torvalds 2007-07-23 21:37 ` Sam Ravnborg 2007-07-24 17:59 ` Adrian Bunk 2007-07-23 22:04 ` 2.6.23-rc1: BUG_ON in kmap_atomic_prot() Alexey Dobriyan 1 sibling, 2 replies; 31+ messages in thread From: Linus Torvalds @ 2007-07-23 21:28 UTC (permalink / raw) To: Andrew Morton; +Cc: Alexey Dobriyan, linux-kernel, netdev On Mon, 23 Jul 2007, Andrew Morton wrote: > > It'd be nice to get a clean trace. Are you able to obtain the full > trace with CONFIG_FRAME_POINTER=y? If you are talking about http://userweb.kernel.org/~akpm/dsc03659.jpg then I think that _is_ a full trace. It's certainly not very messy, and it seems accurate. It's just that inlining makes it much harder to see the call-graphs, but that's what inlining does.. For example, missing from the call graph is get_page_from_freelist -> buffered_rmqueue -> [ missing - inlined ] prep_new_page -> [ missing - inlined ] prep_zero_page -> [ missing - inlined ] clear_highpage -> [ missing - inlined ] kmap_atomic -> [ missing - tailcall ] kmap_atomic_prot but I'm pretty sure the call trace is good (and I'm also pretty sure gcc is overly aggressive at inlining, and that it causes us pain for debugging, but whatever) The earlier part of the trace looks fine too. The only odd part I see is the existence of "dput()" there, so maybe it's not *quite* clean and enabling frame pointers might get rid of a few bogus entries, but it looks pretty close to clean. Linus ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 21:28 ` Linus Torvalds @ 2007-07-23 21:37 ` Sam Ravnborg 2007-07-24 17:59 ` Adrian Bunk 1 sibling, 0 replies; 31+ messages in thread From: Sam Ravnborg @ 2007-07-23 21:37 UTC (permalink / raw) To: Linus Torvalds; +Cc: Andrew Morton, Alexey Dobriyan, linux-kernel, netdev > > For example, missing from the call graph is > > get_page_from_freelist -> > buffered_rmqueue -> [ missing - inlined ] > prep_new_page -> [ missing - inlined ] > prep_zero_page -> [ missing - inlined ] > clear_highpage -> [ missing - inlined ] > kmap_atomic -> [ missing - tailcall ] > kmap_atomic_prot > > (and I'm also pretty sure gcc > is overly aggressive at inlining, and that it causes us pain for > debugging, but whatever) mm/page_alloc.c:static inline void prep_zero_page(struct page *page, int order, gfp_t gfp_flags) include/linux/highmem.h:static inline void clear_highpage(struct page *page) So at least two was explicit marked inline. Now if that made I change i dunno. Sam ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 21:28 ` Linus Torvalds 2007-07-23 21:37 ` Sam Ravnborg @ 2007-07-24 17:59 ` Adrian Bunk 2007-07-24 18:14 ` Linus Torvalds 1 sibling, 1 reply; 31+ messages in thread From: Adrian Bunk @ 2007-07-24 17:59 UTC (permalink / raw) To: Linus Torvalds; +Cc: Andrew Morton, Alexey Dobriyan, linux-kernel, netdev On Mon, Jul 23, 2007 at 02:28:11PM -0700, Linus Torvalds wrote: > > > On Mon, 23 Jul 2007, Andrew Morton wrote: > > > > It'd be nice to get a clean trace. Are you able to obtain the full > > trace with CONFIG_FRAME_POINTER=y? > > If you are talking about > > http://userweb.kernel.org/~akpm/dsc03659.jpg > > then I think that _is_ a full trace. It's certainly not very messy, and it > seems accurate. It's just that inlining makes it much harder to see the > call-graphs, but that's what inlining does.. > > For example, missing from the call graph is > > get_page_from_freelist -> > buffered_rmqueue -> [ missing - inlined ] > prep_new_page -> [ missing - inlined ] > prep_zero_page -> [ missing - inlined ] > clear_highpage -> [ missing - inlined ] > kmap_atomic -> [ missing - tailcall ] > kmap_atomic_prot > > but I'm pretty sure the call trace is good (and I'm also pretty sure gcc > is overly aggressive at inlining, and that it causes us pain for > debugging, but whatever) >... For prep_zero_page() and clear_highpage() we can't blame gcc since we force gcc to always inline them. buffered_rmqueue() and prep_new_page() are static functions with only one caller each, and for the normal non-debug case it's a really nice optimization to have them inlined automatically. But it might make sense to add -fno-inline-functions-called-once to the CFLAGS depending on some debug option? > Linus cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 17:59 ` Adrian Bunk @ 2007-07-24 18:14 ` Linus Torvalds 2007-07-24 18:28 ` Andrew Morton 2007-07-26 6:09 ` commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console H. Peter Anvin 0 siblings, 2 replies; 31+ messages in thread From: Linus Torvalds @ 2007-07-24 18:14 UTC (permalink / raw) To: Adrian Bunk; +Cc: Andrew Morton, Alexey Dobriyan, linux-kernel, netdev On Tue, 24 Jul 2007, Adrian Bunk wrote: > > buffered_rmqueue() and prep_new_page() are static functions with only > one caller each, and for the normal non-debug case it's a really nice > optimization to have them inlined automatically. I'm not at all sure I agree. Inlining big functions doesn't actually tend to generally generate any better code, so if gcc's logic is "single callsite - always inline", then that logic is likely not right. Linus ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 18:14 ` Linus Torvalds @ 2007-07-24 18:28 ` Andrew Morton 2007-07-24 19:15 ` Linus Torvalds 2007-07-26 6:09 ` commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console H. Peter Anvin 1 sibling, 1 reply; 31+ messages in thread From: Andrew Morton @ 2007-07-24 18:28 UTC (permalink / raw) To: Linus Torvalds; +Cc: Adrian Bunk, Alexey Dobriyan, linux-kernel, netdev On Tue, 24 Jul 2007 11:14:21 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Tue, 24 Jul 2007, Adrian Bunk wrote: > > > > buffered_rmqueue() and prep_new_page() are static functions with only > > one caller each, and for the normal non-debug case it's a really nice > > optimization to have them inlined automatically. > > I'm not at all sure I agree. > > Inlining big functions doesn't actually tend to generally generate any > better code, so if gcc's logic is "single callsite - always inline", then > that logic is likely not right. > fwiw, -fno-inline-functions-called-once (who knew?) takes i386 allnoconfig vmlinux .text from 928360 up to 955362 bytes (27k larger). A surprisingly large increase - I wonder if it did something dumb. It appears to still correctly inline those things which we've manually marked inline. hm. It would be nice to defeat the autoinlining for debug purposes though. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 18:28 ` Andrew Morton @ 2007-07-24 19:15 ` Linus Torvalds 2007-07-24 19:40 ` Adrian Bunk 2007-07-24 20:27 ` Andi Kleen 0 siblings, 2 replies; 31+ messages in thread From: Linus Torvalds @ 2007-07-24 19:15 UTC (permalink / raw) To: Andrew Morton; +Cc: Adrian Bunk, Alexey Dobriyan, linux-kernel, netdev On Tue, 24 Jul 2007, Andrew Morton wrote: > > fwiw, -fno-inline-functions-called-once (who knew?) takes i386 allnoconfig > vmlinux .text from 928360 up to 955362 bytes (27k larger). > > A surprisingly large increase - I wonder if it did something dumb. It > appears to still correctly inline those things which we've manually marked > inline. hm. I think inlining small enough functions is worth it, and the thing is, the kernel is actually pretty damn good at having lots of small functions. It's one of the few things I really care about from a coding style standpoint. So I'm not surprised that "-fno-inline-functions-called-once" makes things larger, because I think it's generally a good idea to inline things that are just called once. But it does make things harder to debug, and the performance advantages become increasingly small for bigger functions. And that's a balancing act. Do we care about performance? Yes. But do we care so much that it's worth inlining something like buffered_rmqueue()? So I would not be surprised if "-fno-inline-functions-called-once" will disable *all* the inlining heuristics, and say "oh, it's not an inline function, and it's only called once, so we won't inline it at all". So "called once" should probably make the inlining weight bigger (ie inline *larger* functions than you would otherwise), it just shouldn't make it "infinite". It's not worth it. Linus ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 19:15 ` Linus Torvalds @ 2007-07-24 19:40 ` Adrian Bunk 2007-07-24 19:48 ` Linus Torvalds 2007-07-24 20:27 ` Andi Kleen 1 sibling, 1 reply; 31+ messages in thread From: Adrian Bunk @ 2007-07-24 19:40 UTC (permalink / raw) To: Linus Torvalds; +Cc: Andrew Morton, Alexey Dobriyan, linux-kernel, netdev On Tue, Jul 24, 2007 at 12:15:48PM -0700, Linus Torvalds wrote: > > > On Tue, 24 Jul 2007, Andrew Morton wrote: > > > > fwiw, -fno-inline-functions-called-once (who knew?) takes i386 allnoconfig > > vmlinux .text from 928360 up to 955362 bytes (27k larger). > > > > A surprisingly large increase - I wonder if it did something dumb. It > > appears to still correctly inline those things which we've manually marked > > inline. hm. > > I think inlining small enough functions is worth it, and the thing is, the > kernel is actually pretty damn good at having lots of small functions. > It's one of the few things I really care about from a coding style > standpoint. > > So I'm not surprised that "-fno-inline-functions-called-once" makes things > larger, because I think it's generally a good idea to inline things that > are just called once. But it does make things harder to debug, and the > performance advantages become increasingly small for bigger functions. > > And that's a balancing act. Do we care about performance? Yes. When using CONFIG_CC_OPTIMIZE_FOR_SIZE=y we even actively tell gcc that we only care about size and do not care about performance... > But do we > care so much that it's worth inlining something like buffered_rmqueue()? >... Where is the problem with having buffered_rmqueue() inlined? > Linus cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 19:40 ` Adrian Bunk @ 2007-07-24 19:48 ` Linus Torvalds 2007-07-26 18:07 ` Adrian Bunk 0 siblings, 1 reply; 31+ messages in thread From: Linus Torvalds @ 2007-07-24 19:48 UTC (permalink / raw) To: Adrian Bunk; +Cc: Andrew Morton, Alexey Dobriyan, linux-kernel, netdev On Tue, 24 Jul 2007, Adrian Bunk wrote: > > > But do we > > care so much that it's worth inlining something like buffered_rmqueue()? > >... > > Where is the problem with having buffered_rmqueue() inlined? In this case, it was a pain to just even try to find the call chain, or read the asm. I would encourage lots of kernel hackers to read the assembler code gcc generates. I suspect people being aware of code generation issues (and writing their code with that in mind) is a *much* bigger performance impact than gcc inlining random functions. So maybe I'm old-fashioned and crazy, but "readability of the asm result" actually is a worthwhile goal. Not because we care directly, but because I'd like to encourage people to do it, due to the *indirect* benefits. Linus ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 19:48 ` Linus Torvalds @ 2007-07-26 18:07 ` Adrian Bunk 2007-07-26 18:19 ` Linus Torvalds 0 siblings, 1 reply; 31+ messages in thread From: Adrian Bunk @ 2007-07-26 18:07 UTC (permalink / raw) To: Linus Torvalds; +Cc: Andrew Morton, Alexey Dobriyan, linux-kernel, netdev On Tue, Jul 24, 2007 at 12:48:13PM -0700, Linus Torvalds wrote: > > > On Tue, 24 Jul 2007, Adrian Bunk wrote: > > > > > But do we > > > care so much that it's worth inlining something like buffered_rmqueue()? > > >... > > > > Where is the problem with having buffered_rmqueue() inlined? > > In this case, it was a pain to just even try to find the call chain, or > read the asm. Optimization versus debugging is a common issue... As I said, it might make sense to disable this optimization depending on some debugging option. > I would encourage lots of kernel hackers to read the assembler code gcc > generates. I suspect people being aware of code generation issues (and > writing their code with that in mind) is a *much* bigger performance > impact than gcc inlining random functions. > > So maybe I'm old-fashioned and crazy, but "readability of the asm result" > actually is a worthwhile goal. Not because we care directly, but because > I'd like to encourage people to do it, due to the *indirect* benefits. This would lead to people trying to optimize code for one gcc version - and the code might stay this way for 10 years. People should write readable C code. This also has the best chances of resulting in good performance with the next gcc version on the next generation hardware. > Linus cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-26 18:07 ` Adrian Bunk @ 2007-07-26 18:19 ` Linus Torvalds 0 siblings, 0 replies; 31+ messages in thread From: Linus Torvalds @ 2007-07-26 18:19 UTC (permalink / raw) To: Adrian Bunk; +Cc: Andrew Morton, Alexey Dobriyan, linux-kernel, netdev On Thu, 26 Jul 2007, Adrian Bunk wrote: > > > > So maybe I'm old-fashioned and crazy, but "readability of the asm result" > > actually is a worthwhile goal. Not because we care directly, but because > > I'd like to encourage people to do it, due to the *indirect* benefits. > > This would lead to people trying to optimize code for one gcc version - > and the code might stay this way for 10 years. No. The fact is, code that is easy to optimize is easy to optimize. It has _nothing_ to do with gcc versions. Linus ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 19:15 ` Linus Torvalds 2007-07-24 19:40 ` Adrian Bunk @ 2007-07-24 20:27 ` Andi Kleen 2007-07-24 19:45 ` Linus Torvalds 1 sibling, 1 reply; 31+ messages in thread From: Andi Kleen @ 2007-07-24 20:27 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Adrian Bunk, Alexey Dobriyan, linux-kernel, netdev Linus Torvalds <torvalds@linux-foundation.org> writes: > > So "called once" should probably make the inlining weight bigger (ie > inline *larger* functions than you would otherwise), it just shouldn't > make it "infinite". It's not worth it. There's probably a --param where it can be tweaked exactly. The problem is that --params tend to be very gcc version specific and might do something completely different on a newer or older version. So it's better not to use them. -Andi ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 20:27 ` Andi Kleen @ 2007-07-24 19:45 ` Linus Torvalds 0 siblings, 0 replies; 31+ messages in thread From: Linus Torvalds @ 2007-07-24 19:45 UTC (permalink / raw) To: Andi Kleen Cc: Andrew Morton, Adrian Bunk, Alexey Dobriyan, linux-kernel, netdev On Tue, 24 Jul 2007, Andi Kleen wrote: > > There's probably a --param where it can be tweaked exactly. The > problem is that --params tend to be very gcc version specific > and might do something completely different on a newer or > older version. So it's better not to use them. I agree wholeheartedly with that sentiment. We've tried at times (because some gcc snapshots made some *truly* insane choices for a while), and maybe we still have some around. Not worth the pain. Linus ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console 2007-07-24 18:14 ` Linus Torvalds 2007-07-24 18:28 ` Andrew Morton @ 2007-07-26 6:09 ` H. Peter Anvin 1 sibling, 0 replies; 31+ messages in thread From: H. Peter Anvin @ 2007-07-26 6:09 UTC (permalink / raw) To: Jeff Garzik Cc: Adrian Bunk, Andrew Morton, Alexey Dobriyan, linux-kernel, netdev Jeff Garzik wrote: > > On Tue, 24 Jul 2007, Adrian Bunk wrote: >> buffered_rmqueue() and prep_new_page() are static functions with only >> one caller each, and for the normal non-debug case it's a really nice >> optimization to have them inlined automatically. > > I'm not at all sure I agree. > > Inlining big functions doesn't actually tend to generally generate any > better code, so if gcc's logic is "single callsite - always inline", then > that logic is likely not right. > Only up to a threshold, as far as I know. -hpa ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 21:11 ` Andrew Morton 2007-07-23 21:28 ` Linus Torvalds @ 2007-07-23 22:04 ` Alexey Dobriyan 2007-07-23 22:27 ` Andrew Morton 1 sibling, 1 reply; 31+ messages in thread From: Alexey Dobriyan @ 2007-07-23 22:04 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, netdev On Mon, Jul 23, 2007 at 02:11:37PM -0700, Andrew Morton wrote: > On Tue, 24 Jul 2007 01:01:53 +0400 > Alexey Dobriyan <adobriyan@gmail.com> wrote: > > > On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote: > > > > I had more complete info: http://article.gmane.org/gmane.linux.network/66966 > > > > > > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. > > > > > > > > I haven't worked out where that kmap_atomic() call is coming from yet. > > > > Both traces point up into the page allocator, but I _think_ that's stack > > > > gunk. > > > > > > Ahh, you suspect networking. > > > > > > Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network > > > receiving ~20 junk packets per second, one gathering netconsole output > > > and ssh to it, no conntracks and fancy stuff. > > > > > > [reboots with cables physically unplugged] > > > > OK, I run gdb recompile, cat(1) every file in /usr/portage (shitload of > > small files) with both cables unplugged. It all went fine for ~5 minutes > > after that it crashed exactly same way after 10 secs after plugging one > > of them. > > It'd be nice to get a clean trace. Are you able to obtain the full > trace with CONFIG_FRAME_POINTER=y? Sorry, no camera shot, finding camera requires wakening up M. :) It took longer that usual, but here it is kmap_atomic get_page_from_freelist __alloc_pages cache_alloc_refill __alloc_pages cache_alloc_refill kmem_cache_alloc dst_alloc ip_route_input ip_rcv netif_receive_skb rtl8139_poll net_rx_action __do_softirq do_softirq irq_exit do_IRQ common_interrupt handle_mm_fault do_page_fault error_core much more loaded x86_64 box near also running 2.6.23-rc1 with debugging turned on, using atl1 driver doesn't experience any crashes. And I found 2.6.22-b91cba52e9b7b3f1c0037908a192d93a869ca9e5-x entry on top of grub config which means b91cba52e9b7b3f1c0037908a192d93a869ca9e5 _without_ any debugging was OK. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 22:04 ` 2.6.23-rc1: BUG_ON in kmap_atomic_prot() Alexey Dobriyan @ 2007-07-23 22:27 ` Andrew Morton 2007-07-24 5:20 ` Alexey Dobriyan 2007-07-24 8:17 ` Jens Axboe 0 siblings, 2 replies; 31+ messages in thread From: Andrew Morton @ 2007-07-23 22:27 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: Linus Torvalds, linux-kernel, netdev On Tue, 24 Jul 2007 02:04:46 +0400 Alexey Dobriyan <adobriyan@gmail.com> wrote: > On Mon, Jul 23, 2007 at 02:11:37PM -0700, Andrew Morton wrote: > > On Tue, 24 Jul 2007 01:01:53 +0400 > > Alexey Dobriyan <adobriyan@gmail.com> wrote: > > > > > On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote: > > > > > I had more complete info: http://article.gmane.org/gmane.linux.network/66966 > > > > > > > > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. > > > > > > > > > > I haven't worked out where that kmap_atomic() call is coming from yet. > > > > > Both traces point up into the page allocator, but I _think_ that's stack > > > > > gunk. > > > > > > > > Ahh, you suspect networking. > > > > > > > > Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network > > > > receiving ~20 junk packets per second, one gathering netconsole output > > > > and ssh to it, no conntracks and fancy stuff. > > > > > > > > [reboots with cables physically unplugged] > > > > > > OK, I run gdb recompile, cat(1) every file in /usr/portage (shitload of > > > small files) with both cables unplugged. It all went fine for ~5 minutes > > > after that it crashed exactly same way after 10 secs after plugging one > > > of them. > > > > It'd be nice to get a clean trace. Are you able to obtain the full > > trace with CONFIG_FRAME_POINTER=y? > > Sorry, no camera shot, finding camera requires wakening up M. :) > > It took longer that usual, but here it is > > kmap_atomic > get_page_from_freelist > __alloc_pages > cache_alloc_refill > __alloc_pages > cache_alloc_refill > kmem_cache_alloc > dst_alloc > ip_route_input > ip_rcv > netif_receive_skb > rtl8139_poll > net_rx_action > __do_softirq > do_softirq > irq_exit > do_IRQ > common_interrupt > handle_mm_fault > do_page_fault > error_core > > much more loaded x86_64 box near also running 2.6.23-rc1 with debugging > turned on, using atl1 driver doesn't experience any crashes. > > And I found 2.6.22-b91cba52e9b7b3f1c0037908a192d93a869ca9e5-x entry on > top of grub config which means b91cba52e9b7b3f1c0037908a192d93a869ca9e5 > _without_ any debugging was OK. I worked out that the crash I saw was in BUG_ON(!pte_none(*(kmap_pte-idx))); in the read of kmap_pte[idx]. Which would be weird as the caller is using a literal KM_USER0. So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am unable to reproduce it now). If that BUG_ON _is_ triggering then it might indicate that someone is doing a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0. If they're holding an atomic kmap then they'll be running in_atomic so it is unlikely that they accidentally added __GFP_WAIT because lots of people would be getting lots of might_sleep() warnings. Hence that first VM_BUG_ON in prep_zero_page() _should_ be triggering. Do you have CONFIG_DEBUG_VM enabled? Also, it might be useful to apply -mm's kmap_atomic-debugging.patch. it will detect lots of abuse. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 22:27 ` Andrew Morton @ 2007-07-24 5:20 ` Alexey Dobriyan 2007-07-24 8:17 ` Jens Axboe 1 sibling, 0 replies; 31+ messages in thread From: Alexey Dobriyan @ 2007-07-24 5:20 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, netdev On Mon, Jul 23, 2007 at 03:27:12PM -0700, Andrew Morton wrote: > On Tue, 24 Jul 2007 02:04:46 +0400 > Alexey Dobriyan <adobriyan@gmail.com> wrote: > > > On Mon, Jul 23, 2007 at 02:11:37PM -0700, Andrew Morton wrote: > > > On Tue, 24 Jul 2007 01:01:53 +0400 > > > Alexey Dobriyan <adobriyan@gmail.com> wrote: > > > > > > > On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote: > > > > > > I had more complete info: http://article.gmane.org/gmane.linux.network/66966 > > > > > > > > > > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. > > > > > > > > > > > > I haven't worked out where that kmap_atomic() call is coming from yet. > > > > > > Both traces point up into the page allocator, but I _think_ that's stack > > > > > > gunk. > > > > > > > > > > Ahh, you suspect networking. > > > > > > > > > > Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network > > > > > receiving ~20 junk packets per second, one gathering netconsole output > > > > > and ssh to it, no conntracks and fancy stuff. > > > > > > > > > > [reboots with cables physically unplugged] > > > > > > > > OK, I run gdb recompile, cat(1) every file in /usr/portage (shitload of > > > > small files) with both cables unplugged. It all went fine for ~5 minutes > > > > after that it crashed exactly same way after 10 secs after plugging one > > > > of them. > > > > > > It'd be nice to get a clean trace. Are you able to obtain the full > > > trace with CONFIG_FRAME_POINTER=y? > > > > Sorry, no camera shot, finding camera requires wakening up M. :) > > > > It took longer that usual, but here it is > > > > kmap_atomic > > get_page_from_freelist > > __alloc_pages > > cache_alloc_refill > > __alloc_pages > > cache_alloc_refill > > kmem_cache_alloc > > dst_alloc > > ip_route_input > > ip_rcv > > netif_receive_skb > > rtl8139_poll > > net_rx_action > > __do_softirq > > do_softirq > > irq_exit > > do_IRQ > > common_interrupt > > handle_mm_fault > > do_page_fault > > error_core > > > > much more loaded x86_64 box near also running 2.6.23-rc1 with debugging > > turned on, using atl1 driver doesn't experience any crashes. > > > > And I found 2.6.22-b91cba52e9b7b3f1c0037908a192d93a869ca9e5-x entry on > > top of grub config which means b91cba52e9b7b3f1c0037908a192d93a869ca9e5 > > _without_ any debugging was OK. > > I worked out that the crash I saw was in > > BUG_ON(!pte_none(*(kmap_pte-idx))); > > in the read of kmap_pte[idx]. Which would be weird as the caller is using > a literal KM_USER0. > > So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am > unable to reproduce it now). > > If that BUG_ON _is_ triggering then it might indicate that someone is doing > a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0. > > If they're holding an atomic kmap then they'll be running in_atomic so it > is unlikely that they accidentally added __GFP_WAIT because lots of people > would be getting lots of might_sleep() warnings. > > Hence that first VM_BUG_ON in prep_zero_page() _should_ be triggering. > > Do you have CONFIG_DEBUG_VM enabled? Yes. > Also, it might be useful to apply -mm's kmap_atomic-debugging.patch. it > will detect lots of abuse. I hit it only once with this patch applied, but there were no additional warnings. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 22:27 ` Andrew Morton 2007-07-24 5:20 ` Alexey Dobriyan @ 2007-07-24 8:17 ` Jens Axboe 2007-07-24 8:22 ` Jens Axboe 1 sibling, 1 reply; 31+ messages in thread From: Jens Axboe @ 2007-07-24 8:17 UTC (permalink / raw) To: Andrew Morton Cc: Alexey Dobriyan, Linus Torvalds, linux-kernel, netdev, mark.fasheh On Mon, Jul 23 2007, Andrew Morton wrote: > I worked out that the crash I saw was in > > BUG_ON(!pte_none(*(kmap_pte-idx))); > > in the read of kmap_pte[idx]. Which would be weird as the caller is using > a literal KM_USER0. > > So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am > unable to reproduce it now). > > If that BUG_ON _is_ triggering then it might indicate that someone is doing > a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0. Or doing double kunmaps, or doing a kunmap_atomic() on the page, not the address. I've seen both of those end up triggering that BUG_ON() in a later kmap. Looking over the 2.6.22..2.6.23-rc1 diff, I found one such error in ocfs2 at least. But you are probably not using that, so I'll keep looking... --- [PATCH] ocfs2: bad kunmap_atomic() kunmap_atomic() takes the virtual address, not the mapped page as argument. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 5727cd1..c4034f6 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -2153,7 +2153,7 @@ static int ocfs2_splice_write_actor(struct pipe_inode_info *pipe, src = buf->ops->map(pipe, buf, 1); dst = kmap_atomic(page, KM_USER1); memcpy(dst + offset, src + buf->offset, count); - kunmap_atomic(page, KM_USER1); + kunmap_atomic(dst, KM_USER1); buf->ops->unmap(pipe, buf, src); copied = ocfs2_write_end(file, file->f_mapping, sd->pos, count, count, -- Jens Axboe ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 8:17 ` Jens Axboe @ 2007-07-24 8:22 ` Jens Axboe 2007-07-24 8:34 ` Andrew Morton 2007-07-24 13:55 ` Dan Williams 0 siblings, 2 replies; 31+ messages in thread From: Jens Axboe @ 2007-07-24 8:22 UTC (permalink / raw) To: Andrew Morton Cc: Alexey Dobriyan, Linus Torvalds, linux-kernel, netdev, mark.fasheh, dan.j.williams On Tue, Jul 24 2007, Jens Axboe wrote: > On Mon, Jul 23 2007, Andrew Morton wrote: > > I worked out that the crash I saw was in > > > > BUG_ON(!pte_none(*(kmap_pte-idx))); > > > > in the read of kmap_pte[idx]. Which would be weird as the caller is using > > a literal KM_USER0. > > > > So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am > > unable to reproduce it now). > > > > If that BUG_ON _is_ triggering then it might indicate that someone is doing > > a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0. > > Or doing double kunmaps, or doing a kunmap_atomic() on the page, not the > address. I've seen both of those end up triggering that BUG_ON() in a > later kmap. > > Looking over the 2.6.22..2.6.23-rc1 diff, I found one such error in > ocfs2 at least. But you are probably not using that, so I'll keep > looking... What about the new async crypto stuff? I've been looking, but is it guarenteed that async_memcpy() runs in process context with interrupts enabled always? If not, there's a km type bug there. In general, I think the highmem stuff could do with more safety checks: - People ALWAYS get the atomic unmaps wrong, passing in the page instead of the address. I've seen tons of these. And since kunmap_atomic() takes a void pointer, nobody notices until it goes boom. - People easily get the km type wrong - they use KM_USERx in interrupt context, or one of the irq variants without disabling interrupts. If we could just catch these two types of bugs, we've got a lot of these problems covered. -- Jens Axboe ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 8:22 ` Jens Axboe @ 2007-07-24 8:34 ` Andrew Morton 2007-07-24 14:00 ` Dan Williams 2007-07-24 13:55 ` Dan Williams 1 sibling, 1 reply; 31+ messages in thread From: Andrew Morton @ 2007-07-24 8:34 UTC (permalink / raw) To: Jens Axboe Cc: Alexey Dobriyan, Linus Torvalds, linux-kernel, netdev, mark.fasheh, dan.j.williams, Nelson, Shannon On Tue, 24 Jul 2007 10:22:07 +0200 Jens Axboe <jens.axboe@oracle.com> wrote: > On Tue, Jul 24 2007, Jens Axboe wrote: > > On Mon, Jul 23 2007, Andrew Morton wrote: > > > I worked out that the crash I saw was in > > > > > > BUG_ON(!pte_none(*(kmap_pte-idx))); > > > > > > in the read of kmap_pte[idx]. Which would be weird as the caller is using > > > a literal KM_USER0. > > > > > > So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am > > > unable to reproduce it now). > > > > > > If that BUG_ON _is_ triggering then it might indicate that someone is doing > > > a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0. > > > > Or doing double kunmaps, or doing a kunmap_atomic() on the page, not the > > address. I've seen both of those end up triggering that BUG_ON() in a > > later kmap. > > > > Looking over the 2.6.22..2.6.23-rc1 diff, I found one such error in > > ocfs2 at least. But you are probably not using that, so I'll keep > > looking... > > What about the new async crypto stuff? I've been looking, but is it > guarenteed that async_memcpy() runs in process context with interrupts > enabled always? If not, there's a km type bug there. I think Shannon maintains that now. > In general, I think the highmem stuff could do with more safety checks: > > - People ALWAYS get the atomic unmaps wrong, passing in the page instead > of the address. I've seen tons of these. And since kunmap_atomic() > takes a void pointer, nobody notices until it goes boom. yeah, it's a real trap. For a while I had a patch which converted kmap_atomic() to return a char*, and kunmap_atomic() to take a char*, so misuse got compile warnings. But it was a pig to maintain so I tossed it. It'd be somewhat easier to do now we've converted a lot of callers to clear_user_highpage() and similar. > - People easily get the km type wrong - they use KM_USERx in interrupt > context, or one of the irq variants without disabling interrupts. > > If we could just catch these two types of bugs, we've got a lot of these > problems covered. Here's the -mm debug patch: diff -puN arch/i386/mm/highmem.c~kmap_atomic-debugging arch/i386/mm/highmem.c --- a/arch/i386/mm/highmem.c~kmap_atomic-debugging +++ a/arch/i386/mm/highmem.c @@ -30,7 +30,44 @@ void *kmap_atomic(struct page *page, enu { enum fixed_addresses idx; unsigned long vaddr; + static unsigned warn_count = 10; + if (unlikely(warn_count == 0)) + goto skip; + + if (unlikely(in_interrupt())) { + if (in_irq()) { + if (type != KM_IRQ0 && type != KM_IRQ1 && + type != KM_BIO_SRC_IRQ && type != KM_BIO_DST_IRQ && + type != KM_BOUNCE_READ) { + WARN_ON(1); + warn_count--; + } + } else if (!irqs_disabled()) { /* softirq */ + if (type != KM_IRQ0 && type != KM_IRQ1 && + type != KM_SOFTIRQ0 && type != KM_SOFTIRQ1 && + type != KM_SKB_SUNRPC_DATA && + type != KM_SKB_DATA_SOFTIRQ && + type != KM_BOUNCE_READ) { + WARN_ON(1); + warn_count--; + } + } + } + + if (type == KM_IRQ0 || type == KM_IRQ1 || type == KM_BOUNCE_READ || + type == KM_BIO_SRC_IRQ || type == KM_BIO_DST_IRQ) { + if (!irqs_disabled()) { + WARN_ON(1); + warn_count--; + } + } else if (type == KM_SOFTIRQ0 || type == KM_SOFTIRQ1) { + if (irq_count() == 0 && !irqs_disabled()) { + WARN_ON(1); + warn_count--; + } + } +skip: /* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */ pagefault_disable(); _ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 8:34 ` Andrew Morton @ 2007-07-24 14:00 ` Dan Williams 0 siblings, 0 replies; 31+ messages in thread From: Dan Williams @ 2007-07-24 14:00 UTC (permalink / raw) To: Andrew Morton Cc: Jens Axboe, Alexey Dobriyan, Linus Torvalds, linux-kernel, netdev, mark.fasheh, Nelson, Shannon On 7/24/07, Andrew Morton <akpm@linux-foundation.org> wrote: [...] > > What about the new async crypto stuff? I've been looking, but is it > > guarenteed that async_memcpy() runs in process context with interrupts > > enabled always? If not, there's a km type bug there. > > I think Shannon maintains that now. > I am looking after the async_tx API, I will send a patch to update MAINTAINERS shortly. -- Dan ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 8:22 ` Jens Axboe 2007-07-24 8:34 ` Andrew Morton @ 2007-07-24 13:55 ` Dan Williams 1 sibling, 0 replies; 31+ messages in thread From: Dan Williams @ 2007-07-24 13:55 UTC (permalink / raw) To: Jens Axboe Cc: Andrew Morton, Alexey Dobriyan, Linus Torvalds, linux-kernel, netdev, mark.fasheh On 7/24/07, Jens Axboe <jens.axboe@oracle.com> wrote: > What about the new async crypto stuff? I've been looking, but is it > guarenteed that async_memcpy() runs in process context with interrupts > enabled always? If not, there's a km type bug there. > Currently the only user is the MD raid456 driver, and yes, it only performs copies from the handle_stripe routine which is always run in process context with interrupts enabled. However this is not documented. Would it be advisable to add a WARN_ON for this condition? > In general, I think the highmem stuff could do with more safety checks: > > - People ALWAYS get the atomic unmaps wrong, passing in the page instead > of the address. I've seen tons of these. And since kunmap_atomic() > takes a void pointer, nobody notices until it goes boom. > - People easily get the km type wrong - they use KM_USERx in interrupt > context, or one of the irq variants without disabling interrupts. > > If we could just catch these two types of bugs, we've got a lot of these > problems covered. > > > -- > Jens Axboe > -- Dan ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-23 20:24 ` 2.6.23-rc1: BUG_ON in kmap_atomic_prot() Andrew Morton 2007-07-23 20:40 ` Alexey Dobriyan @ 2007-07-24 10:01 ` Mike Galbraith 2007-07-24 10:37 ` Mike Galbraith 2007-07-24 16:28 ` Andrew Morton 1 sibling, 2 replies; 31+ messages in thread From: Mike Galbraith @ 2007-07-24 10:01 UTC (permalink / raw) To: Andrew Morton; +Cc: Alexey Dobriyan, Linus Torvalds, linux-kernel, netdev On Mon, 2007-07-23 at 13:24 -0700, Andrew Morton wrote: > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. My box bugged during boot the first time I booted 23-rc1, but nothing made it to the console, and I didn't have a serial console running. I didn't have DEBUG_PAGEALLOC or friends set. > I haven't worked out where that kmap_atomic() call is coming from yet. > Both traces point up into the page allocator, but I _think_ that's stack > gunk. I just enabled all debug options, and was just rewarded with the below. [ 119.079531] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1 [ 119.558867] ------------[ cut here ]------------ [ 119.572197] kernel BUG at arch/i386/mm/highmem.c:38! [ 119.585804] invalid opcode: 0000 [#1] [ 119.598013] PREEMPT SMP DEBUG_PAGEALLOC [ 119.610103] Modules linked in: edd button battery ac ip6t_REJECT xt_tcpudp ipt_REJECT xt_state iptable_mangle iptable_nat nf_nat iptable_filter ip6table_mangle nf_conntrack_ipv4 nf_conntrack nfnetlink ip_tables ip6table_filter ip6_tables x_tables nls_iso8859_1 nls_cp437 nls_utf8 snd_intel8x0 snd_ac97_codec ac97_bus snd_mpu401 snd_pcm prism54 snd_timer snd_mpu401_uart snd_rawmidi snd_seq_device snd intel_agp agpgart soundcore snd_page_alloc i2c_i801 fan thermal processor [ 119.698063] CPU: 1 [ 119.698065] EIP: 0060:[<c011cd2d>] Not tainted VLI [ 119.698067] EFLAGS: 00010006 (2.6.23-rc1-smp #75) [ 119.736358] EIP is at kmap_atomic_prot+0xa7/0xab [ 119.749647] eax: 3d07f163 ebx: c166db80 ecx: c0750e60 edx: 00000007 [ 119.765417] esi: 00000022 edi: 00000163 ebp: c069dcd4 esp: c069dcc8 [ 119.781273] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 [ 119.796378] Process udevd (pid: 4775, ti=c069d000 task=f31aea60 task.ti=f477d000) [ 119.804068] Stack: c166db80 00000000 c166db80 c069dcdc c011cd3f c069dd40 c015b6e0 00000001 [ 119.822272] 00000044 00000163 00000000 00000001 c165f4e0 00000001 c165f4e0 00000001 [ 119.840762] 00000000 00028020 c061e71c c166db80 00000046 00000080 00000001 c011e4de [ 119.859389] Call Trace: [ 119.881302] [<c0105144>] show_trace_log_lvl+0x1a/0x30 [ 119.896319] [<c01051ff>] show_stack_log_lvl+0xa5/0xca [ 119.911171] [<c0105420>] show_registers+0x1fc/0x343 [ 119.925756] [<c0105689>] die+0x122/0x249 [ 119.939241] [<c0105834>] do_trap+0x84/0xad [ 119.952897] [<c0105b1c>] do_invalid_op+0x88/0x92 [ 119.967118] [<c04cf3c2>] error_code+0x72/0x78 [ 119.980948] [<c011cd3f>] kmap_atomic+0xe/0x10 [ 119.994642] [<c015b6e0>] get_page_from_freelist+0x39e/0x45e [ 120.009485] [<c015b7fb>] __alloc_pages+0x5b/0x2db [ 120.023342] [<c0172872>] cache_alloc_refill+0x380/0x6f2 [ 120.037623] [<c0172e7a>] kmem_cache_alloc+0xa1/0xa5 [ 120.051426] [<c03fb397>] neigh_create+0x5f/0x506 [ 120.064894] [<c046e25d>] ndisc_dst_alloc+0x122/0x151 [ 120.078769] [<c0471b0b>] __ndisc_send+0x8d/0x4fa [ 120.092340] [<c0472915>] ndisc_send_ns+0x5f/0x7d [ 120.105848] [<c0469ff5>] addrconf_dad_timer+0xdb/0xe0 [ 120.119758] [<c012f8a0>] run_timer_softirq+0x130/0x191 [ 120.133717] [<c012c06d>] __do_softirq+0x76/0xe4 [ 120.147475] [<c0106b48>] do_softirq+0x63/0xac [ 120.147488] [<c012bff5>] (gdb) list *neigh_create+0x5f 0xc03fb397 is in neigh_create (include/linux/slab.h:259). 254 /* 255 * Shortcuts 256 */ 257 static inline void *kmem_cache_zalloc(struct kmem_cache *k, gfp_t flags) 258 { 259 return kmem_cache_alloc(k, flags | __GFP_ZERO); 260 } 261 262 /** 263 * kzalloc - allocate memory. The memory is set to zero. (gdb) list *kmem_cache_alloc+0xa1 0xc0172e7a is in kmem_cache_alloc (mm/slab.c:3176). 3171 STATS_INC_ALLOCHIT(cachep); 3172 ac->touched = 1; 3173 objp = ac->entry[--ac->avail]; 3174 } else { 3175 STATS_INC_ALLOCMISS(cachep); 3176 objp = cache_alloc_refill(cachep, flags); 3177 } 3178 return objp; 3179 } 3180 (gdb) list *cache_alloc_refill+0x380 0xc0172872 is in cache_alloc_refill (include/linux/gfp.h:154). 149 150 /* Unknown node is current node */ 151 if (nid < 0) 152 nid = numa_node_id(); 153 154 return __alloc_pages(gfp_mask, order, 155 NODE_DATA(nid)->node_zonelists + gfp_zone(gfp_mask)); 156 } 157 158 #ifdef CONFIG_NUMA (gdb) list *__alloc_pages+0x5b 0xc015b7fb is in __alloc_pages (mm/page_alloc.c:1248). 1243 if (unlikely(*z == NULL)) { 1244 /* Should this ever happen?? */ 1245 return NULL; 1246 } 1247 1248 page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, order, 1249 zonelist, ALLOC_WMARK_LOW|ALLOC_CPUSET); 1250 if (page) 1251 goto got_pg; 1252 (gdb) list *get_page_from_freelist+0x39e 0xc015b6e0 is in get_page_from_freelist (include/linux/highmem.h:122). 117 return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr); 118 } 119 120 static inline void clear_highpage(struct page *page) 121 { 122 void *kaddr = kmap_atomic(page, KM_USER0); 123 clear_page(kaddr); 124 kunmap_atomic(kaddr, KM_USER0); 125 } 126 ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 10:01 ` Mike Galbraith @ 2007-07-24 10:37 ` Mike Galbraith 2007-07-24 16:28 ` Andrew Morton 1 sibling, 0 replies; 31+ messages in thread From: Mike Galbraith @ 2007-07-24 10:37 UTC (permalink / raw) To: Andrew Morton; +Cc: Alexey Dobriyan, Linus Torvalds, linux-kernel, netdev On Tue, 2007-07-24 at 12:01 +0200, Mike Galbraith wrote: > On Mon, 2007-07-23 at 13:24 -0700, Andrew Morton wrote: > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. > > My box bugged during boot the first time I booted 23-rc1, but nothing > made it to the console, and I didn't have a serial console running. I > didn't have DEBUG_PAGEALLOC or friends set. > > > I haven't worked out where that kmap_atomic() call is coming from yet. > > Both traces point up into the page allocator, but I _think_ that's stack > > gunk. > > I just enabled all debug options, and was just rewarded with the below. Hm. I just also experienced filesystem corruption when I tried to send from that kernel, and it bugged in the process. My mount table ended up in /etc/resolv.conf along with some binary goop, making nscd rather unhappy after reboot. fsck time. .Mike ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 10:01 ` Mike Galbraith 2007-07-24 10:37 ` Mike Galbraith @ 2007-07-24 16:28 ` Andrew Morton 2007-07-24 18:25 ` Linus Torvalds 1 sibling, 1 reply; 31+ messages in thread From: Andrew Morton @ 2007-07-24 16:28 UTC (permalink / raw) To: Mike Galbraith Cc: Alexey Dobriyan, Linus Torvalds, linux-kernel, netdev, Christoph Lameter On Tue, 24 Jul 2007 12:01:09 +0200 Mike Galbraith <efault@gmx.de> wrote: > On Mon, 2007-07-23 at 13:24 -0700, Andrew Morton wrote: > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out. > > My box bugged during boot the first time I booted 23-rc1, but nothing > made it to the console, and I didn't have a serial console running. I > didn't have DEBUG_PAGEALLOC or friends set. > > > I haven't worked out where that kmap_atomic() call is coming from yet. > > Both traces point up into the page allocator, but I _think_ that's stack > > gunk. > > I just enabled all debug options, and was just rewarded with the below. doh. It's a slab bug. > [ 119.079531] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1 > [ 119.558867] ------------[ cut here ]------------ > [ 119.572197] kernel BUG at arch/i386/mm/highmem.c:38! > [ 119.585804] invalid opcode: 0000 [#1] > [ 119.598013] PREEMPT SMP DEBUG_PAGEALLOC > [ 119.610103] Modules linked in: edd button battery ac ip6t_REJECT xt_tcpudp ipt_REJECT xt_state iptable_mangle iptable_nat nf_nat iptable_filter ip6table_mangle nf_conntrack_ipv4 nf_conntrack nfnetlink ip_tables ip6table_filter ip6_tables x_tables nls_iso8859_1 nls_cp437 nls_utf8 snd_intel8x0 snd_ac97_codec ac97_bus snd_mpu401 snd_pcm prism54 snd_timer snd_mpu401_uart snd_rawmidi snd_seq_device snd intel_agp agpgart soundcore snd_page_alloc i2c_i801 fan thermal processor > [ 119.698063] CPU: 1 > [ 119.698065] EIP: 0060:[<c011cd2d>] Not tainted VLI > [ 119.698067] EFLAGS: 00010006 (2.6.23-rc1-smp #75) > [ 119.736358] EIP is at kmap_atomic_prot+0xa7/0xab > [ 119.749647] eax: 3d07f163 ebx: c166db80 ecx: c0750e60 edx: 00000007 > [ 119.765417] esi: 00000022 edi: 00000163 ebp: c069dcd4 esp: c069dcc8 > [ 119.781273] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 > [ 119.796378] Process udevd (pid: 4775, ti=c069d000 task=f31aea60 task.ti=f477d000) > [ 119.804068] Stack: c166db80 00000000 c166db80 c069dcdc c011cd3f c069dd40 c015b6e0 00000001 > [ 119.822272] 00000044 00000163 00000000 00000001 c165f4e0 00000001 c165f4e0 00000001 > [ 119.840762] 00000000 00028020 c061e71c c166db80 00000046 00000080 00000001 c011e4de > [ 119.859389] Call Trace: > [ 119.881302] [<c0105144>] show_trace_log_lvl+0x1a/0x30 > [ 119.896319] [<c01051ff>] show_stack_log_lvl+0xa5/0xca > [ 119.911171] [<c0105420>] show_registers+0x1fc/0x343 > [ 119.925756] [<c0105689>] die+0x122/0x249 > [ 119.939241] [<c0105834>] do_trap+0x84/0xad > [ 119.952897] [<c0105b1c>] do_invalid_op+0x88/0x92 > [ 119.967118] [<c04cf3c2>] error_code+0x72/0x78 > [ 119.980948] [<c011cd3f>] kmap_atomic+0xe/0x10 > [ 119.994642] [<c015b6e0>] get_page_from_freelist+0x39e/0x45e > [ 120.009485] [<c015b7fb>] __alloc_pages+0x5b/0x2db > [ 120.023342] [<c0172872>] cache_alloc_refill+0x380/0x6f2 > [ 120.037623] [<c0172e7a>] kmem_cache_alloc+0xa1/0xa5 > [ 120.051426] [<c03fb397>] neigh_create+0x5f/0x506 > [ 120.064894] [<c046e25d>] ndisc_dst_alloc+0x122/0x151 > [ 120.078769] [<c0471b0b>] __ndisc_send+0x8d/0x4fa > [ 120.092340] [<c0472915>] ndisc_send_ns+0x5f/0x7d > [ 120.105848] [<c0469ff5>] addrconf_dad_timer+0xdb/0xe0 > [ 120.119758] [<c012f8a0>] run_timer_softirq+0x130/0x191 > [ 120.133717] [<c012c06d>] __do_softirq+0x76/0xe4 > [ 120.147475] [<c0106b48>] do_softirq+0x63/0xac > [ 120.147488] [<c012bff5>] > (gdb) list *neigh_create+0x5f > 0xc03fb397 is in neigh_create (include/linux/slab.h:259). > 254 /* > 255 * Shortcuts > 256 */ > 257 static inline void *kmem_cache_zalloc(struct kmem_cache *k, gfp_t flags) > 258 { > 259 return kmem_cache_alloc(k, flags | __GFP_ZERO); > 260 } See, networking's kmem_cache_alloc(..., __GFP_ZERO) ended up calling into the page allocator with __GFP_ZERO. This is the bug - slab isn't supposed to do that: the __GFP_ZERO is supposed to be removed. Now, it's not a highmem page, so prep_zero_page() won't actually establish a kmap, but it will check that the kmap slot is presently unused on this CPU. But networking calls in here from softirq context (illegal for KM_USER0) and sometimes that KM_USER0 slot *will* be in use, so kmap_atomic_prot() will go BUG. I must say it's really really scary that such a low-level function as prep_zero_page() is using KM_USER0. I don't think it has enough debugging checks in there to prevent Bad Stuff from going undetected. I guess this was the bug: --- a/mm/slab.c~a +++ a/mm/slab.c @@ -2776,7 +2776,7 @@ static int cache_grow(struct kmem_cache * 'nodeid'. */ if (!objp) - objp = kmem_getpages(cachep, flags, nodeid); + objp = kmem_getpages(cachep, local_flags, nodeid); if (!objp) goto failed; _ I don't see why you later got fs corruption - afacit we won't actually modify the KM_USER0 slot in this scenario. > 262 /** > 263 * kzalloc - allocate memory. The memory is set to zero. > (gdb) list *kmem_cache_alloc+0xa1 > 0xc0172e7a is in kmem_cache_alloc (mm/slab.c:3176). > 3171 STATS_INC_ALLOCHIT(cachep); > 3172 ac->touched = 1; > 3173 objp = ac->entry[--ac->avail]; > 3174 } else { > 3175 STATS_INC_ALLOCMISS(cachep); > 3176 objp = cache_alloc_refill(cachep, flags); > 3177 } > 3178 return objp; > 3179 } > 3180 > (gdb) list *cache_alloc_refill+0x380 > 0xc0172872 is in cache_alloc_refill (include/linux/gfp.h:154). > 149 > 150 /* Unknown node is current node */ > 151 if (nid < 0) > 152 nid = numa_node_id(); > 153 > 154 return __alloc_pages(gfp_mask, order, > 155 NODE_DATA(nid)->node_zonelists + gfp_zone(gfp_mask)); > 156 } > 157 > 158 #ifdef CONFIG_NUMA > (gdb) list *__alloc_pages+0x5b > 0xc015b7fb is in __alloc_pages (mm/page_alloc.c:1248). > 1243 if (unlikely(*z == NULL)) { > 1244 /* Should this ever happen?? */ > 1245 return NULL; > 1246 } > 1247 > 1248 page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, order, > 1249 zonelist, ALLOC_WMARK_LOW|ALLOC_CPUSET); > 1250 if (page) > 1251 goto got_pg; > 1252 > (gdb) list *get_page_from_freelist+0x39e > 0xc015b6e0 is in get_page_from_freelist (include/linux/highmem.h:122). > 117 return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr); > 118 } > 119 > 120 static inline void clear_highpage(struct page *page) > 121 { > 122 void *kaddr = kmap_atomic(page, KM_USER0); > 123 clear_page(kaddr); > 124 kunmap_atomic(kaddr, KM_USER0); > 125 } > 126 > > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 16:28 ` Andrew Morton @ 2007-07-24 18:25 ` Linus Torvalds 2007-07-24 20:05 ` Alexey Dobriyan 2007-07-25 5:09 ` Mike Galbraith 0 siblings, 2 replies; 31+ messages in thread From: Linus Torvalds @ 2007-07-24 18:25 UTC (permalink / raw) To: Andrew Morton Cc: Mike Galbraith, Alexey Dobriyan, linux-kernel, netdev, Christoph Lameter On Tue, 24 Jul 2007, Andrew Morton wrote: > > I guess this was the bug: Looks very likely to me. Mike, Alexey, does this fix things for you? Linus ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 18:25 ` Linus Torvalds @ 2007-07-24 20:05 ` Alexey Dobriyan 2007-07-25 5:09 ` Mike Galbraith 1 sibling, 0 replies; 31+ messages in thread From: Alexey Dobriyan @ 2007-07-24 20:05 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Mike Galbraith, linux-kernel, netdev, Christoph Lameter On Tue, Jul 24, 2007 at 11:25:22AM -0700, Linus Torvalds wrote: > On Tue, 24 Jul 2007, Andrew Morton wrote: > > I guess this was the bug: > > Looks very likely to me. Mike, Alexey, does this fix things for you? Yeah, box is running for more than hour, survived LTP, gdb testsuite, portage sync and so on. What new to me is that someone decided TSC is unstable but that's probably OK on full debug kernel: Clocksource tsc unstable (delta = 91724418 ns) Time: pit clocksource has been installed. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot() 2007-07-24 18:25 ` Linus Torvalds 2007-07-24 20:05 ` Alexey Dobriyan @ 2007-07-25 5:09 ` Mike Galbraith 1 sibling, 0 replies; 31+ messages in thread From: Mike Galbraith @ 2007-07-25 5:09 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Alexey Dobriyan, linux-kernel, netdev, Christoph Lameter On Tue, 2007-07-24 at 11:25 -0700, Linus Torvalds wrote: > > On Tue, 24 Jul 2007, Andrew Morton wrote: > > > > I guess this was the bug: > > Looks very likely to me. Mike, Alexey, does this fix things for you? I don't have very much runtime on it yet, but yes, it seems to have. -Mike ^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2007-07-26 18:19 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <alpine.LFD.0.999.0707221351030.3607@woody.linux-foundation.org>
[not found] ` <20070723183839.GA5874@martell.zuzino.mipt.ru>
[not found] ` <20070723190152.GA5755@martell.zuzino.mipt.ru>
2007-07-23 20:24 ` 2.6.23-rc1: BUG_ON in kmap_atomic_prot() Andrew Morton
2007-07-23 20:40 ` Alexey Dobriyan
2007-07-23 21:01 ` Alexey Dobriyan
2007-07-23 21:11 ` Andrew Morton
2007-07-23 21:28 ` Linus Torvalds
2007-07-23 21:37 ` Sam Ravnborg
2007-07-24 17:59 ` Adrian Bunk
2007-07-24 18:14 ` Linus Torvalds
2007-07-24 18:28 ` Andrew Morton
2007-07-24 19:15 ` Linus Torvalds
2007-07-24 19:40 ` Adrian Bunk
2007-07-24 19:48 ` Linus Torvalds
2007-07-26 18:07 ` Adrian Bunk
2007-07-26 18:19 ` Linus Torvalds
2007-07-24 20:27 ` Andi Kleen
2007-07-24 19:45 ` Linus Torvalds
2007-07-26 6:09 ` commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console H. Peter Anvin
2007-07-23 22:04 ` 2.6.23-rc1: BUG_ON in kmap_atomic_prot() Alexey Dobriyan
2007-07-23 22:27 ` Andrew Morton
2007-07-24 5:20 ` Alexey Dobriyan
2007-07-24 8:17 ` Jens Axboe
2007-07-24 8:22 ` Jens Axboe
2007-07-24 8:34 ` Andrew Morton
2007-07-24 14:00 ` Dan Williams
2007-07-24 13:55 ` Dan Williams
2007-07-24 10:01 ` Mike Galbraith
2007-07-24 10:37 ` Mike Galbraith
2007-07-24 16:28 ` Andrew Morton
2007-07-24 18:25 ` Linus Torvalds
2007-07-24 20:05 ` Alexey Dobriyan
2007-07-25 5:09 ` Mike Galbraith
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).