From: Stanislav Meduna <stano@meduna.org>
To: "linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>
Cc: rostedt@goodmis.org, Thomas Gleixner <tglx@linutronix.de>,
	Carsten Emde <C.Emde@osadl.org>
Subject: Re: timerfd read does not return - hangs inside put_user
Date: Mon, 13 May 2013 01:20:32 +0200
Message-ID: <519023C0.2030603@meduna.org>
In-Reply-To: <518CEB45.9080705@meduna.org>

On 10.05.2013 14:42, Stanislav Meduna wrote:

> 49762.928029 the timerfd thread gets the CPU, does not give it back,
> but the end of the syscall is not reached in a sane time.

Hmm... I added some trace_printks into fs/timerfd.c:


static ssize_t timerfd_read(
{
...
	trace_printk("timerfd_read before unlock, res=%d", res);
	spin_unlock_irq(&ctx->wqh.lock);
	trace_printk("timerfd_read after unlock, res=%d", res);
	if (ticks)
		res = put_user(ticks, (u64 __user *) buf) ? -EFAULT: sizeof(ticks);
	trace_printk("timerfd_read return, res=%d, ticks=%llu, buf=%p", res, (unsigned long long) ticks, buf);
	return res;
}

The first two are printed. The last one is not.

0....0 timerfd_read: timerfd_read before unlock, res=0
0....0 timerfd_read: timerfd_read after unlock, res=0

The usual

0....0 timerfd_read: timerfd_read return, res=8, ticks=1, buf=0xb7600158

does _not_ happen here.

So it looks like the problem happens inside the put_user,
maybe a pagefault? The buf is an address on the stack:

while(alive) {
	u64 exp;

	if (read(tmrFd, &exp, sizeof(exp)) != sizeof(exp) || exp < 1)
		continue;
...

and the process is running with mlockall(MCL_CURRENT | MCL_FUTURE),
so the whole stack should be locked into RAM.

ps -o min_flt,maj_flt for the task shows 969701 minor faults
(the count is not incrementing right now - I will check it when
the hang happens again) and 0 major ones.
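
The same counters can also be sampled from inside the process, which makes
it easier to compare them before and after a suspect read. A minimal sketch
using getrusage(); struct fault_counts and sample_faults() are hypothetical
names, and RUSAGE_SELF reports process-wide totals rather than per-thread
ones:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical container for the two counters of interest. */
struct fault_counts {
	long minor;	/* faults served without I/O */
	long major;	/* faults that required I/O */
};

/* Fill *out with the process-wide fault totals; 0 on success, -1 on error. */
static int sample_faults(struct fault_counts *out)
{
	struct rusage ru;

	assert(out != NULL);
	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return -1;
	out->minor = ru.ru_minflt;
	out->major = ru.ru_majflt;
	return 0;
}
```

Taking one sample before the read() and one after it, and logging the
difference, would show whether a stalled read coincides with new faults.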

This starts to look like some priority inversion where a realtime
thread is busy-waiting for something a SCHED_OTHER thread holds.
After the RT throttler kicks in, that other thread can proceed
and the system eventually recovers - unfortunately after
a 1+ second pause, which is what I am seeing.
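
For reference, the throttler's budget is set by the sched_rt_period_us and
sched_rt_runtime_us sysctls. The sketch below reads one such value and
computes how much of each period is left for non-realtime tasks; the
function names read_long_from_file() and non_rt_window_us() are
hypothetical, and it only shows the arithmetic, it does not explain the
length of the pause seen here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read a single integer from a file such as
 * /proc/sys/kernel/sched_rt_runtime_us. Returns 0 on success, -1 on error. */
static int read_long_from_file(const char *path, long *out)
{
	FILE *f;
	int ok;

	assert(out != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Microseconds per period left to non-RT tasks once RT tasks have used
 * their full budget. Returns 0 when throttling is disabled (runtime_us
 * is negative) or when the RT budget covers the whole period. */
static long non_rt_window_us(long period_us, long runtime_us)
{
	if (runtime_us < 0 || runtime_us >= period_us)
		return 0;
	return period_us - runtime_us;
}
```

With the stock values of 1000000 and 950000, non_rt_window_us() gives
50000, i.e. 50 ms per second in which a SCHED_OTHER thread can run.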

Is there something that could stall the whole realtime thread
for a significant time? Unfortunately I am not an expert
on memory management, pagefaults etc. - but I guess this should
only produce a minor pagefault, and that should never block?
Or am I seeing it wrong?

Regards
-- 
                                     Stano

