From: Stanislav Meduna <stano@meduna.org>
To: "linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>
Cc: rostedt@goodmis.org, Thomas Gleixner <tglx@linutronix.de>,
	Carsten Emde <C.Emde@osadl.org>
Subject: Livelock in handle_pte_fault [Was: Re: timerfd read does not return]
Date: Tue, 14 May 2013 10:31:09 +0200
Message-ID: <5191F64D.2050509@meduna.org>
In-Reply-To: <51909EAE.8070901@meduna.org>

On 13.05.2013 10:05, Stanislav Meduna wrote:

> 0d...0 62811.755394: function:  do_page_fault
> 0....0 62811.755396: function:     handle_mm_fault
> 0....0 62811.755398: function:        handle_pte_fault
> 0d...0 62811.755402: function:  do_page_fault
> 0....0 62811.755404: function:     handle_mm_fault
> 0....0 62811.755406: function:        handle_pte_fault

The flags in the pagefault handler are 0x28 - if I understand
it correctly, FAULT_FLAG_KILLABLE | FAULT_FLAG_ALLOW_RETRY.
The faulting address is indeed the one on the stack that worked
for hours before; it is mlockall()-ed and I have (of course)
no swap. I will add some code to print the content of
the offending pte.
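
For reference, the reading of 0x28 can be checked with a small
decoder. This is only a sketch: the FAULT_FLAG_* values below are
assumed to match include/linux/mm.h of 3.x-era kernels and should be
verified against the tree actually in use. decode_fault_flags() is a
hypothetical helper, not a kernel function.

```c
#include <string.h>

/* Assumed values (3.x-era include/linux/mm.h); verify against your tree. */
#define FAULT_FLAG_WRITE        0x01
#define FAULT_FLAG_NONLINEAR    0x02
#define FAULT_FLAG_MKWRITE      0x04
#define FAULT_FLAG_ALLOW_RETRY  0x08
#define FAULT_FLAG_RETRY_NOWAIT 0x10
#define FAULT_FLAG_KILLABLE     0x20

struct flag_name {
	unsigned int bit;
	const char *name;
};

static const struct flag_name fault_flag_names[] = {
	{ FAULT_FLAG_WRITE,        "FAULT_FLAG_WRITE" },
	{ FAULT_FLAG_NONLINEAR,    "FAULT_FLAG_NONLINEAR" },
	{ FAULT_FLAG_MKWRITE,      "FAULT_FLAG_MKWRITE" },
	{ FAULT_FLAG_ALLOW_RETRY,  "FAULT_FLAG_ALLOW_RETRY" },
	{ FAULT_FLAG_RETRY_NOWAIT, "FAULT_FLAG_RETRY_NOWAIT" },
	{ FAULT_FLAG_KILLABLE,     "FAULT_FLAG_KILLABLE" },
};

/*
 * Hypothetical helper: write the names of all set flags into buf,
 * separated by " | ", lowest bit first. len must be at least 1.
 * Returns buf.
 */
static char *decode_fault_flags(unsigned int flags, char *buf, size_t len)
{
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(fault_flag_names) / sizeof(fault_flag_names[0]); i++) {
		if (!(flags & fault_flag_names[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, " | ", len - strlen(buf) - 1);
		strncat(buf, fault_flag_names[i].name, len - strlen(buf) - 1);
	}
	return buf;
}
```

With these assumed values, decode_fault_flags(0x28, buf, sizeof(buf))
yields "FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE". FAULT_FLAG_WRITE
is not set, so under this assumption the fault is being handled as a
read access.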

The code in handle_pte_fault proceeds through the
  entry = pte_mkyoung(entry);
line and the following
  ptep_set_access_flags
call returns zero. This repeats ad nauseam without anything else
running in between. I will add some tracing prints to output
the content of the pte.
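
To illustrate what a zero return means here, the following is a
simplified user-space model of that step. It is not the kernel code.
The model_* names and pte_model_t are hypothetical, and the PTE bit
values are assumptions chosen for illustration (they mirror x86, but
the architecture is not stated in this message). The model only
captures the general contract that ptep_set_access_flags reports
whether the stored entry had to be changed.

```c
#include <stdbool.h>

/* Hypothetical PTE bits for this model; not taken from the thread. */
#define PTE_PRESENT  0x001UL
#define PTE_RW       0x002UL
#define PTE_ACCESSED 0x020UL
#define PTE_DIRTY    0x040UL

typedef unsigned long pte_model_t;

/* Models pte_mkyoung(): return the entry with the accessed bit set. */
static pte_model_t model_pte_mkyoung(pte_model_t pte)
{
	return pte | PTE_ACCESSED;
}

/*
 * Models the contract of ptep_set_access_flags(): store the new entry
 * only if it differs from the current one, and report whether it did.
 */
static bool model_set_access_flags(pte_model_t *ptep, pte_model_t entry)
{
	bool changed = (*ptep != entry);

	if (changed)
		*ptep = entry;
	return changed;
}

/*
 * Models the step described above for a read fault:
 * returns true if the stored PTE had to be updated.
 */
static bool model_fault_tail(pte_model_t *ptep)
{
	pte_model_t entry = model_pte_mkyoung(*ptep);

	return model_set_access_flags(ptep, entry);
}
```

In this model, a PTE that already has the accessed bit set makes
model_fault_tail() return false on every call and is left untouched.
That would match the observation above: the stored pte already looks
the way the handler wants it, so the handler has nothing to repair,
and whatever keeps the access faulting is not visible in the entry
itself.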

Adding flush_tlb_page(vma, address) at the beginning of
handle_pte_fault does not change anything.

The length of the hang could correlate with the time until
some SCHED_OTHER process is scheduled after the RT throttler
activates. There is a process running every 2 seconds and the
length of the hang is usually between 1 and 3 seconds. This
is not (yet) verified.

I am starting to think that the virtual memory mapping of the
process somehow got corrupted and is fixed at the next regular
context switch. Afterwards there is no switch to any other
non-kernel process, only to ksoftirqd, irq threads or other
threads of the same process. Shortly before, there was some
switching between modprobe and kworker, and a sched_process_free
of kworker and modprobe in the RCU softirq.

The symptoms are similar to
http://lkml.indiana.edu/hypermail/linux/kernel/1103.0/01364.html

Regards
-- 
                                            Stano

