* Oops when trying to create more than 16000 timers
From: Ottavio Campana @ 2008-12-15 16:04 UTC
  To: linux-kernel

Hello,

I'm sorry if this is not the best place to ask, but I haven't found an
answer anywhere else.

I am currently developing software that needs approx. 60k timers. I
currently use timer_create and the related functions to manage timers.

I've noticed that after creating 16039 timers I always get an oops
from the kernel, always of the same kind:

Dec 15 15:20:00 evolution kernel: [601680.417064] BUG: unable to handle
kernel NULL pointer dereference at 0000000000000040

I looked at kernel/posix-timers.c and found that POSIX timers are kept
in slab memory. Monitoring my program, I can see in /proc/slabinfo that
the number of timers increases to 16039, at which point I get the oops
and no more timers are created.

In /proc/slabinfo I have

posix_timers_cache  0  0  192  20  1 : tunables  120  60  8 : slabdata  0  0  0

If I increase the tunables to 2400 120 16, the maximum number of POSIX
timers that I can create does not change.

How can I tune the kernel so that it supports 60k or more timers?

Thanks for your help,

Ottavio

--
Ottavio Campana
Telecommunication Engineer

Lab. Immagini
Dept. of Information Engineering
University of Padova
Via Gradenigo 6/B
35131 Padova
Italy

phone: +39 049 8277641
fax: +39 049 8277699
e-mail: ottavio.campana@dei.unipd.it
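The source of the timer_tester program is not part of this thread. As a rough sketch of the kind of loop described above, the following C function keeps calling timer_create() until it fails or a cap is reached, then cleans up. The function name count_creatable_timers, the choice of CLOCK_MONOTONIC, and the use of SIGRTMIN are assumptions made for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: try to create up to `max` POSIX timers, then delete
 * them all again. Returns how many timer_create() calls succeeded. */
size_t count_creatable_timers(size_t max)
{
    timer_t *ids = calloc(max, sizeof(*ids));
    size_t n = 0;

    if (!ids)
        return 0;

    /* Signal-based notification; the timers are never armed here. */
    struct sigevent sev = {0};
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGRTMIN;

    /* Stop at the first failure (for example EAGAIN once a limit is hit). */
    while (n < max && timer_create(CLOCK_MONOTONIC, &sev, &ids[n]) == 0)
        n++;

    for (size_t i = 0; i < n; i++)
        timer_delete(ids[i]);
    free(ids);
    return n;
}
```

Calling count_creatable_timers(60000) and comparing the result with 60000 shows whether the process can actually get the number of timers it needs. On older glibc versions the program may have to be linked with -lrt.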
* Re: Oops when trying to create more than 16000 timers
From: Pekka Enberg @ 2008-12-15 16:45 UTC
  To: Ottavio Campana; +Cc: linux-kernel

Hi Ottavio,

On Mon, Dec 15, 2008 at 6:04 PM, Ottavio Campana
<ottavio.campana@dei.unipd.it> wrote:
> I am currently developing software that needs approx. 60k timers. I
> currently use timer_create and the related functions to manage timers.
>
> I've noticed that after creating 16039 timers I always get an oops
> from the kernel, always of the same kind:
>
> Dec 15 15:20:00 evolution kernel: [601680.417064] BUG: unable to handle
> kernel NULL pointer dereference at 0000000000000040

This would be a kernel bug, so can you please post the full oops? See
REPORTING-BUGS and Documentation/oops-tracing.txt for details.
* Re: Oops when trying to create more than 16000 timers
From: Ottavio Campana @ 2008-12-15 21:41 UTC
  To: Pekka Enberg; +Cc: linux-kernel

Pekka Enberg wrote:
> Hi Ottavio,
>
> On Mon, Dec 15, 2008 at 6:04 PM, Ottavio Campana
> <ottavio.campana@dei.unipd.it> wrote:
>> I am currently developing software that needs approx. 60k timers. I
>> currently use timer_create and the related functions to manage timers.
>>
>> I've noticed that after creating 16039 timers I always get an oops
>> from the kernel, always of the same kind:
>>
>> Dec 15 15:20:00 evolution kernel: [601680.417064] BUG: unable to handle
>> kernel NULL pointer dereference at 0000000000000040
>
> This would be a kernel bug, so can you please post the full oops? See
> REPORTING-BUGS and Documentation/oops-tracing.txt for details.

I hope the report is correct; please tell me if you want me to do
anything else.
PGD 7adfc067 PUD 7ad5e067 PMD 0
CPU 1
Modules linked in: ipv6 dm_snapshot dm_mirror dm_log dm_mod loop e1000
snd_hda_intel snd_pcm_oss snd_pcm snd_mixer_oss snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_timer snd_seq_device parport_pc psmouse
iTCO_wdt snd parport i2c_i801 serio_raw soundcore i2c_core pcspkr
snd_page_alloc evdev button intel_agp dcdbas ext3 jbd mbcache sg sr_mod
cdrom sd_mod ata_piix uhci_hcd ehci_hcd e1000e ide_pci_generic ide_core
ata_generic libata scsi_mod dock thermal processor fan thermal_sys
Pid: 2743, comm: timer_tester Not tainted 2.6.26-1-amd64 #1
RIP: 0010:[<ffffffff80245291>] [<ffffffff80245291>] sys_timer_create+0x79/0x360
RSP: 0018:ffff81007a5ddef8 EFLAGS: 00010286
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000086
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000286
RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000001
R10: 00007f927adb3a50 R11: 0000000000000000 R12: 00000000016ed768
R13: 00000000016ed6d0 R14: 00007fff833dead0 R15: 00000000016ed754
FS: 00007f927b3cd6e0(0000) GS:ffff81007d37a9c0(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000040 CR3: 000000007b00c000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process timer_tester (pid: 2743, threadinfo ffff81007a5dc000, task ffff81007a7936b0)
Stack: ffff81007646f9f8 0000000000000000 0000000000000000 ffffffff8029b4ed
 0000000000000000 ffff81007b024c80 0000000000000017 fffffffffffffff7
 00007f927b3da000 ffffffff8029ba36 0000000000000292 0000000000000000
Call Trace:
 [<ffffffff8029b4ed>] vfs_write+0x121/0x156
 [<ffffffff8029ba36>] sys_write+0x60/0x6e
 [<ffffffff8020beca>] system_call_after_swapgs+0x8a/0x8f

Code: c6 0e 05 00 48 85 c0 74 3b 48 89 c5 e8 d0 a0 ff ff 48 85 c0 48 89
45 40 75 11 48 8b 3d e1 22 3e 00 48 89 ee 31 ed e8 fe 10 05 00 <48> 8b
7d 40 31 f6 ba 80 00 00 00 48 83 c7 18 e8 6b ae 0d 00 48
RIP [<ffffffff80245291>] sys_timer_create+0x79/0x360
 RSP <ffff81007a5ddef8>
CR2: 0000000000000040
---[ end trace 2e93d77cdbbd83c6 ]---

Using decodecode

Code: c6 0e 05 00 48 85 c0 74 3b 48 89 c5 e8 d0 a0 ff ff 48 85 c0 48 89
45 40 75 11 48 8b 3d e1 22 3e 00 48 89 ee 31 ed e8 fe 10 05 00 <48> 8b
7d 40 31 f6 ba 80 00 00 00 48 83 c7 18 e8 6b ae 0d 00 48

/tmp/tmp.CcCjFxHOMN.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0: c6                    (bad)
   1: 0e                    (bad)
   2: 05 00 48 85 c0        add    $0xc0854800,%eax
   7: 74 3b                 je     0x44
   9: 48 89 c5              mov    %rax,%rbp
   c: e8 d0 a0 ff ff        callq  0xffffffffffffa0e1
  11: 48 85 c0              test   %rax,%rax
  14: 48 89 45 40           mov    %rax,0x40(%rbp)
  18: 75 11                 jne    0x2b
  1a: 48 8b 3d e1 22 3e 00  mov    0x3e22e1(%rip),%rdi   # 0x3e2302
  21: 48 89 ee              mov    %rbp,%rsi
  24: 31 ed                 xor    %ebp,%ebp
  26: e8 fe 10 05 00        callq  0x51129

/tmp/tmp.CcCjFxHOMN.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0: 48 8b 7d 40           mov    0x40(%rbp),%rdi
   4: 31 f6                 xor    %esi,%esi
   6: ba 80 00 00 00        mov    $0x80,%edx
   b: 48 83 c7 18           add    $0x18,%rdi
   f: e8 6b ae 0d 00        callq  0xdae7f
  14: 48                    rex.W
* Re: Oops when trying to create more than 16000 timers
From: Pekka Enberg @ 2008-12-15 21:37 UTC
  To: Ottavio Campana
  Cc: linux-kernel, error27, Thomas Gleixner, Andrew Morton, torvalds, Greg KH

Hi Ottavio,

On Mon, Dec 15, 2008 at 11:41 PM, Ottavio Campana
<ottavio.campana@dei.unipd.it> wrote:
> I hope the report is correct, please tell me if you want me to do
> something else.
>
> RIP: 0010:[<ffffffff80245291>] [<ffffffff80245291>]
> sys_timer_create+0x79/0x360
> [...]
> CR2: 0000000000000040
> ---[ end trace 2e93d77cdbbd83c6 ]---
>
> Using decodecode
> [...]

I think you're simply hitting RLIMIT_SIGPENDING and then tripping over
a bug in alloc_posix_timer() that's fixed by commit
aa94fbd5ccd840c8ab26d02439ec799b03a72547 ("fix error-path NULL deref
in alloc_posix_timer()") in 2.6.28-rc8.

Dan, can you please send your patch to the -stable queue as well?

                                Pekka
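For readers who want to see the shape of such an error-path bug, here is a small userspace model in C. It is only a sketch based on what the decodecode output suggests: a pointer is stored at offset 0x40 of an object, and on the failure path the object pointer is cleared (xor %ebp,%ebp) before being dereferenced again at 0x40(%rbp), which matches CR2 = 0x40 with RBP = 0. All names below (timer_model, sigq_model, sigq_alloc_model, alloc_timer_fixed) are hypothetical and this is not the kernel source.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the timer object and its signal-queue entry. */
struct sigq_model {
    char info[128];
};

struct timer_model {
    struct sigq_model *sigq;
};

/* Models a sub-allocation that fails once a per-user quota is used up. */
static struct sigq_model *sigq_alloc_model(int quota_left)
{
    return quota_left > 0 ? calloc(1, sizeof(struct sigq_model)) : NULL;
}

/* Corrected shape: bail out as soon as the sub-allocation fails.
 * The buggy shape would free the timer, set the pointer to NULL, and then
 * fall through to the memset below, dereferencing NULL. */
struct timer_model *alloc_timer_fixed(int quota_left)
{
    struct timer_model *tmr = calloc(1, sizeof(*tmr));

    if (!tmr)
        return NULL;

    tmr->sigq = sigq_alloc_model(quota_left);
    if (!tmr->sigq) {
        free(tmr);
        return NULL; /* return early instead of continuing with tmr == NULL */
    }

    memset(tmr->sigq->info, 0, sizeof(tmr->sigq->info));
    return tmr;
}
```

With the early return, a caller sees an ordinary allocation failure when the quota is exhausted instead of a crash.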
* Re: Oops when trying to create more than 16000 timers
From: Ottavio Campana @ 2008-12-15 22:46 UTC
  To: Pekka Enberg
  Cc: linux-kernel, error27, Thomas Gleixner, Andrew Morton, torvalds, Greg KH

Pekka Enberg wrote:
> I think you're simply hitting RLIMIT_SIGPENDING and then tripping over
> a bug in alloc_posix_timer() that's fixed by commit
> aa94fbd5ccd840c8ab26d02439ec799b03a72547 ("fix error-path NULL deref
> in alloc_posix_timer()") in 2.6.28-rc8.
>
> Dan, can you please send your patch to the -stable queue as well?

Pekka, can this RLIMIT_SIGPENDING be modified? For my application, I
will need 60k timers.

Thanks,

Ottavio
* Re: Oops when trying to create more than 16000 timers
From: Pekka Enberg @ 2008-12-15 21:44 UTC
  To: Ottavio Campana
  Cc: linux-kernel, error27, Thomas Gleixner, Andrew Morton, torvalds, Greg KH

Hi Ottavio,

Pekka Enberg wrote:
>> I think you're simply hitting RLIMIT_SIGPENDING and then tripping over
>> a bug in alloc_posix_timer() that's fixed by commit
>> aa94fbd5ccd840c8ab26d02439ec799b03a72547 ("fix error-path NULL deref
>> in alloc_posix_timer()") in 2.6.28-rc8.
>>
>> Dan, can you please send your patch to the -stable queue as well?

Ottavio Campana wrote:
> Pekka, can this RLIMIT_SIGPENDING be modified? For my application, I
> will need 60k timers.

I'd assume you can do that via the setrlimit() system call. See the man
page for details.
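As a minimal sketch of the setrlimit() suggestion above, the C function below raises the soft RLIMIT_SIGPENDING limit of the calling process to at least a requested value. The helper name ensure_sigpending_limit is made up for this example. Raising the hard limit normally requires privilege (CAP_SYS_RESOURCE), so an unprivileged process can only go up to its current hard limit.

```c
#include <sys/resource.h>

/* Hypothetical helper: make sure the soft RLIMIT_SIGPENDING is at least
 * `wanted`. Returns 0 on success and -1 on failure, with errno set by
 * getrlimit() or setrlimit(). */
int ensure_sigpending_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_SIGPENDING, &rl) != 0)
        return -1;

    /* Nothing to do if the current soft limit is already high enough. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;

    rl.rlim_cur = wanted;

    /* If the hard limit is lower than requested, try to raise it as well;
     * this fails with EPERM for unprivileged processes. */
    if (rl.rlim_max != RLIM_INFINITY && rl.rlim_max < wanted)
        rl.rlim_max = wanted;

    return setrlimit(RLIMIT_SIGPENDING, &rl);
}
```

A program that needs about 60k timers could call ensure_sigpending_limit(65536) early in main() and check the return value before creating any timers. The limit can also be raised from the shell before starting the program, for example with the bash builtin ulimit -i.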
* Re: Oops when trying to create more than 16000 timers
From: Greg KH @ 2008-12-16 23:18 UTC
  To: Pekka Enberg
  Cc: Ottavio Campana, linux-kernel, error27, Thomas Gleixner, Andrew Morton, torvalds

On Mon, Dec 15, 2008 at 11:37:38PM +0200, Pekka Enberg wrote:
> [...]
>
> I think you're simply hitting RLIMIT_SIGPENDING and then tripping over
> a bug in alloc_posix_timer() that's fixed by commit
> aa94fbd5ccd840c8ab26d02439ec799b03a72547 ("fix error-path NULL deref
> in alloc_posix_timer()") in 2.6.28-rc8.
>
> Dan, can you please send your patch to the -stable queue as well?

This patch is already in the 2.6.27 release.

thanks,

greg k-h