linux-rt-users.vger.kernel.org archive mirror
From: Stanislav Meduna <stano@meduna.org>
To: "linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>
Cc: rostedt@goodmis.org, Thomas Gleixner <tglx@linutronix.de>,
	Carsten Emde <C.Emde@osadl.org>
Subject: Re: timerfd read does not return - hangs inside put_user
Date: Mon, 13 May 2013 01:20:32 +0200
Message-ID: <519023C0.2030603@meduna.org>
In-Reply-To: <518CEB45.9080705@meduna.org>

On 10.05.2013 14:42, Stanislav Meduna wrote:

> 49762.928029 the timerfd thread gets the CPU, does not give it back,
> but the end of the syscall is not reached in a sane time.

Hmm... I added some trace_printks into fs/timerfd.c:


static ssize_t timerfd_read(
{
...
	trace_printk("timerfd_read before unlock, res=%d", res);
	spin_unlock_irq(&ctx->wqh.lock);
	trace_printk("timerfd_read after unlock, res=%d", res);
	if (ticks)
		res = put_user(ticks, (u64 __user *) buf) ? -EFAULT: sizeof(ticks);
	trace_printk("timerfd_read return, res=%d, ticks=%llu, buf=%p", res, (unsigned long long) ticks, buf);
	return res;
}

The first two are printed. The last one is not.

0....0 timerfd_read: timerfd_read before unlock, res=0
0....0 timerfd_read: timerfd_read after unlock, res=0

The usual

0....0 timerfd_read: timerfd_read return, res=8, ticks=1, buf=0xb7600158

does _not_ happen here.

So it looks like the problem happens inside put_user,
maybe a page fault? buf is an address on the stack:

while(alive) {
	u64 exp;

	if (read(tmrFd, &exp, sizeof(exp)) != sizeof(exp) || exp < 1)
		continue;
...

and the process is running with mlockall(MCL_CURRENT | MCL_FUTURE),
so the whole stack should be locked into RAM.

ps -o min_flt,maj_flt for the task shows 969701 minor faults
(the count is not incrementing; I will check it again the next
time the hang happens) and 0 major ones.
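
The same counters can also be sampled from inside the process, which
makes it easier to compare them right before and after a suspicious
read(). A minimal sketch using getrusage(2); the struct and function
names are made up for this example, and RUSAGE_SELF covers the whole
process rather than a single thread:

```c
#include <sys/resource.h>

/* Hypothetical container for the two counters ps shows as
 * min_flt and maj_flt. */
struct fault_counts {
	long minor;
	long major;
};

/* Fill *out with the process-wide page fault counters.
 * Returns 0 on success, -1 if getrusage() fails. */
int read_fault_counts(struct fault_counts *out)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return -1;

	out->minor = ru.ru_minflt;	/* faults served without I/O */
	out->major = ru.ru_majflt;	/* faults that required I/O */
	return 0;
}
```

Sampling once before and once after the read() and comparing the two
structs shows whether the call itself took a fault.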

This starts to look like some priority inversion where a realtime
thread is busy-waiting for something a SCHED_OTHER thread holds.
After the RT throttler kicks in, that other thread can proceed
and the system eventually recovers - unfortunately after
a 1+ second pause, which is what I am seeing.
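
For reference, the throttler's budget is set by the
sched_rt_runtime_us and sched_rt_period_us sysctls. The sketch below
reads them and computes how much of each period is left for
non-realtime tasks. The function names are invented for this example.
With the usual defaults of 950000 and 1000000 a spinning realtime
task leaves SCHED_OTHER 50 ms per second; the values on the affected
machine were not checked here.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a file such as
 * /proc/sys/kernel/sched_rt_runtime_us. Returns 0 on success, -1 on error. */
int read_long_file(const char *path, long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Microseconds per period left to non-realtime tasks while realtime
 * tasks use their full budget. Returns 0 when throttling is disabled
 * (runtime < 0) or the runtime covers the whole period. */
long non_rt_share_us(long runtime_us, long period_us)
{
	if (runtime_us < 0 || runtime_us >= period_us)
		return 0;
	return period_us - runtime_us;
}
```

Feeding the two values read from /proc/sys/kernel/ into
non_rt_share_us() gives the window in which the other thread can run
once the throttler has kicked in.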

Is there something that could stall the whole realtime thread
for a significant time? Unfortunately I am not an expert on
memory management, page faults etc., but I guess this should
only produce a minor page fault, and that should never block?
Or am I seeing it wrong?

Regards
-- 
                                     Stano

