From: Stanislav Meduna
Subject: Livelock in handle_pte_fault [Was: Re: timerfd read does not return]
Date: Tue, 14 May 2013 10:31:09 +0200
Message-ID: <5191F64D.2050509@meduna.org>
References: <516BDE52.90200@meduna.org> <516BF8FD.2000700@meduna.org>
 <516EC3F3.1080406@meduna.org> <516FB8B9.9090506@meduna.org>
 <517B8D91.4010700@meduna.org> <518CEB45.9080705@meduna.org>
 <519023C0.2030603@meduna.org> <51909EAE.8070901@meduna.org>
In-Reply-To: <51909EAE.8070901@meduna.org>
To: "linux-rt-users@vger.kernel.org"
Cc: rostedt@goodmis.org, Thomas Gleixner, Carsten Emde

On 13.05.2013 10:05, Stanislav Meduna wrote:

> 0d...0 62811.755394: function: do_page_fault
> 0....0 62811.755396: function: handle_mm_fault
> 0....0 62811.755398: function: handle_pte_fault
> 0d...0 62811.755402: function: do_page_fault
> 0....0 62811.755404: function: handle_mm_fault
> 0....0 62811.755406: function: handle_pte_fault

The flags in the page fault handler are 0x28, which, if I understand it
correctly, is FAULT_FLAG_KILLABLE | FAULT_FLAG_ALLOW_RETRY. The faulting
address is indeed the one on the stack that worked for hours before; it
is mlockall()-ed and I have (of course) no swap.

The code in handle_pte_fault proceeds through the

  entry = pte_mkyoung(entry);

line and the following ptep_set_access_flags returns zero. This repeats
ad nauseam without anything else running in between. I will add some
tracing prints to output the content of the offending pte.

Adding flush_tlb_page(vma, address) at the beginning of handle_pte_fault
does not change anything.
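To double-check the reading of the 0x28 flags value, here is a minimal
userspace sketch that decodes it. The FAULT_FLAG_* values are copied
from include/linux/mm.h as I remember them for 3.x-era kernels, so
treat them as an assumption and verify them against the tree actually
in use. The helper decode_fault_flags() is a hypothetical name and not
something from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumed values, as in include/linux/mm.h of 3.x-era kernels.
 * Check them against the kernel source you are actually running.
 */
#define FAULT_FLAG_WRITE        0x01
#define FAULT_FLAG_NONLINEAR    0x02
#define FAULT_FLAG_MKWRITE      0x04
#define FAULT_FLAG_ALLOW_RETRY  0x08
#define FAULT_FLAG_RETRY_NOWAIT 0x10
#define FAULT_FLAG_KILLABLE     0x20

struct flag_name {
	unsigned int bit;
	const char *name;
};

/* Ordered by ascending bit value; the output follows this order. */
static const struct flag_name fault_flag_names[] = {
	{ FAULT_FLAG_WRITE,        "FAULT_FLAG_WRITE" },
	{ FAULT_FLAG_NONLINEAR,    "FAULT_FLAG_NONLINEAR" },
	{ FAULT_FLAG_MKWRITE,      "FAULT_FLAG_MKWRITE" },
	{ FAULT_FLAG_ALLOW_RETRY,  "FAULT_FLAG_ALLOW_RETRY" },
	{ FAULT_FLAG_RETRY_NOWAIT, "FAULT_FLAG_RETRY_NOWAIT" },
	{ FAULT_FLAG_KILLABLE,     "FAULT_FLAG_KILLABLE" },
};

/*
 * Hypothetical helper: write an "A | B" style decoding of flags into
 * buf and return buf. Output is silently truncated if buf is too small.
 */
static char *decode_fault_flags(unsigned int flags, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(fault_flag_names) / sizeof(fault_flag_names[0]); i++) {
		if (!(flags & fault_flag_names[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, " | ", len - strlen(buf) - 1);
		strncat(buf, fault_flag_names[i].name, len - strlen(buf) - 1);
	}
	return buf;
}
```

With the values above, decode_fault_flags(0x28, buf, sizeof(buf)) gives
"FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE", so FAULT_FLAG_WRITE is
not set in the looping fault.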

The length of the hang could correlate with the time until some
SCHED_OTHER process is scheduled after the RT throttler activates.
There is a process running every 2 seconds and the length of the hang
is usually between 1 and 3 seconds. This is not (yet) verified.

I am starting to think that the virtual memory mapping of the process
somehow got corrupted and is fixed at the next regular context switch.
There is no switch to any other non-kernel process afterwards, only to
ksoftirqd, irq threads or other threads of the same process. Shortly
before the hang there was some switching between modprobe and kworker,
and sched_process_free of kworker and modprobe in the RCU softirq.

The symptoms are similar to
http://lkml.indiana.edu/hypermail/linux/kernel/1103.0/01364.html

Regards
-- 
Stano