From: Stanislav Meduna
Subject: Re: timerfd read does not return [Was: Re: timerfd and softirqd]
Date: Fri, 19 Apr 2013 21:53:52 +0200
Message-ID: <5171A0D0.8050100@meduna.org>
In-Reply-To: <516FB8B9.9090506@meduna.org>
To: "linux-rt-users@vger.kernel.org"

On 18.04.2013 11:11, Stanislav Meduna wrote:

> There is no return from the read until the RT throttler got activated.
> Kernel version 3.4.25-rt37 - I'll check whether anything relevant
> has changed since then.

I changed the RT throttler to call show_state_filter(0) (sysrq-t).
I have no idea whether this is a sane thing to do in that context or
how relevant the results are, but from the failing thread I am getting

TimerT          R running      0   671    636 0x10000000
 ceb76a70 b7633158 c01169b0 ceb51ed4 c0116acb 00000029 ceb51e6c c012385d
 ceb51edc cea6d248 cd8024b0 cea6d200 00000002 00000029 c012385d 00000001
 00000000 00000000 ceb51ea0 c0146476 00000030 ceb51ea8 c012385d ceb51ed4
Call Trace:
 [] ? mm_fault_error+0x170/0x170
 [] ? do_page_fault+0x11b/0x390
 [] ? irq_exit+0x6d/0x80
 [] ? irq_exit+0x6d/0x80
 [] ? sub_preempt_count+0x36/0x50
 [] ? irq_exit+0x6d/0x80
 [] ? do_IRQ+0x48/0x94
 [] ? mm_fault_error+0x170/0x170
 last message repeated 2 times
 [] ? error_code+0x5d/0x70
 [] ? sys_rt_sigqueueinfo+0x70/0x90
 [] ? mm_fault_error+0x170/0x170
 [] ? __put_user_8+0x11/0x19
 [] ? timerfd_read+0x183/0x280
 [] ? wake_up_bit+0x60/0x60
 [] ? vfs_read+0x9f/0x150
 [] ? fget_light+0x87/0xe0
 [] ? sys_timerfd_gettime+0x140/0x140
 [] ? sys_read+0x42/0x70
 [] ? syscall_call+0x7/0xb
 [] ? init_memory_mapping+0x1a0/0x410

I have two instances of this trace and one other:

Call Trace:
 [] ? trace_hardirqs_on_thunk+0xc/0x10
 [] ? restore_all+0xf/0xf
 [] ? cfq_completed_request+0x58b/0x5f0
 [] ? __kunmap_atomic+0x50/0x60
 [] ? handle_pte_fault+0x141/0xa00
 [] ? handle_mm_fault+0x80/0xc0
 [] ? rt_up_read+0x1d/0x30
 [] ? do_page_fault+0x168/0x390
 [] ? irq_exit+0x6d/0x80
 [] ? irq_to_desc+0x14/0x20
 [] ? handle_irq+0x1f/0x90
 [] ? do_IRQ+0x3f/0x94
 [] ? x86_schedule_events+0x34b/0x4b0
 [] ? common_interrupt+0x2e/0x40
 [] ? sys_rt_sigqueueinfo+0x70/0x90
 [] ? __put_user_8+0x11/0x19
 [] ? timerfd_read+0x183/0x280
 [] ? wake_up_bit+0x60/0x60
 [] ? vfs_read+0x9f/0x150
 [] ? fget_light+0x87/0xe0
 [] ? sys_timerfd_gettime+0x140/0x140
 [] ? sys_read+0x42/0x70
 [] ? syscall_call+0x7/0xb

The __put_user_8 looks like the put_user at the end of timerfd_read.
Where does the sys_rt_sigqueueinfo come from?

The read destination is just a normal 8-byte variable on the stack.
I don't see how this can become invalid while the thread is in the kernel:

  uint64_t exp;
  ...
  read(tmrFd, &exp, sizeof(exp))

The timer fd _does_ eventually return - after 200 ms to 5 seconds -
and works normally afterwards. Unfortunately I don't know whether
the read returns 8 or fails with EFAULT. In any case there is no
crash, and I do not catch SIGSEGV, SIGBUS and similar.

I'm puzzled. Does this make sense to anyone? Is this better suited
for the general Linux list?

Unrelated question: I am not so familiar with Linux at this level,
but shouldn't interrupts be handled on their own stack?

Thanks
-- 
Stano