From: John Blackwood <john.blackwood@ccur.com>
To: Richard Weinberger <richard.weinberger@gmail.com>,
Austin Schuh <austin@peloton-tech.com>
Cc: linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org,
xfs@oss.sgi.com
Subject: Re: Filesystem lockup with CONFIG_PREEMPT_RT
Date: Wed, 21 May 2014 14:30:23 -0500
Message-ID: <537CFECF.9070701@ccur.com>
> Date: Wed, 21 May 2014 03:33:49 -0400
> From: Richard Weinberger <richard.weinberger@gmail.com>
> To: Austin Schuh <austin@peloton-tech.com>
> CC: LKML <linux-kernel@vger.kernel.org>, xfs <xfs@oss.sgi.com>, rt-users
> <linux-rt-users@vger.kernel.org>
> Subject: Re: Filesystem lockup with CONFIG_PREEMPT_RT
>
> CC'ing RT folks
>
> On Wed, May 21, 2014 at 8:23 AM, Austin Schuh <austin@peloton-tech.com> wrote:
> > > On Tue, May 13, 2014 at 7:29 PM, Austin Schuh <austin@peloton-tech.com> wrote:
> >> >> Hi,
> >> >>
> >> >> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
> >> >> patched kernel. I have currently only triggered it using dpkg. Dave
> >> >> Chinner on the XFS mailing list suggested that it was a rt-kernel
> >> >> workqueue issue as opposed to a XFS problem after looking at the
> >> >> kernel messages.
> >> >>
> >> >> The only modification to the kernel besides the RT patch is that I
> >> >> have applied tglx's "genirq: Sanitize spurious interrupt detection of
> >> >> threaded irqs" patch.
> > >
> > > I upgraded to 3.14.3-rt4, and the problem still persists.
> > >
> > > I turned on event tracing and tracked it down further. I'm able to
> > > lock it up by scping a new kernel debian package to /tmp/ on the
> > > machine. scp is locking the inode, and then scheduling
> > > xfs_bmapi_allocate_worker in the work queue. The work then never gets
> > > run. The kworkers then lock up waiting for the inode lock.
> > >
> > > Here are the relevant events from the trace. ffff8803e9f10288
> > > (blk_delay_work) gets run later on in the trace, but ffff8803b4c158d0
> > > (xfs_bmapi_allocate_worker) never does. The kernel then warns about
> > > blocked tasks 120 seconds later.
Austin and Richard,

I'm not 100% sure that the patch below will fix your problem, but we
saw something that sounds very similar to your issue, involving the
nvidia driver and the preempt-rt patch. The nvidia driver uses the
completion support to build its own internally used semaphore.

Some tasks never woke up from their wait_for_completion() calls
because of a race in the underlying do_wait_for_common() routine.

This is the patch that we used to fix the issue:
------------------- -------------------
Fix a race in the PREEMPT_RT wait-for-completion simple wait code.

A wait_for_completion() waiter task can be awoken by a task calling
complete(), but fail to consume the 'done' completion resource if it
loses a race with another task calling wait_for_completion() just as
it is waking up.

In this case, the awoken task will call schedule_timeout() again
without being in the simple wait queue, so a later complete() will
not find it there to wake it.

So if the awoken task is unable to claim the 'done' completion resource,
check to see if it needs to be re-inserted into the wait list before
waiting again in schedule_timeout().
Fix-by: John Blackwood <john.blackwood@ccur.com>
Index: b/kernel/sched/core.c
===================================================================
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3529,11 +3529,19 @@ static inline long __sched
 do_wait_for_common(struct completion *x,
 		   long (*action)(long), long timeout, int state)
 {
+	int again = 0;
+
 	if (!x->done) {
 		DEFINE_SWAITER(wait);
 		swait_prepare_locked(&x->wait, &wait);
 		do {
+			/* Check to see if we lost race for 'done' and are
+			 * no longer in the wait list.
+			 */
+			if (unlikely(again) && list_empty(&wait.node))
+				swait_prepare_locked(&x->wait, &wait);
+
 			if (signal_pending_state(state, current)) {
 				timeout = -ERESTARTSYS;
 				break;
@@ -3542,6 +3550,7 @@ do_wait_for_common(struct completion *x,
 			raw_spin_unlock_irq(&x->wait.lock);
 			timeout = action(timeout);
 			raw_spin_lock_irq(&x->wait.lock);
+			again = 1;
 		} while (!x->done && timeout);
 		swait_finish_locked(&x->wait, &wait);
 		if (!x->done)
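
To make the sequence of events in the description above easier to follow,
here is a small single-threaded userspace model of the race. It is only a
sketch and not kernel code: every name in it (sim_completion, sim_waiter,
sim_enqueue, sim_complete, sim_lost_wakeup_scenario) is made up for
illustration, and it assumes, as the patch description implies, that
complete() takes the woken waiter off the wait list.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 4

struct sim_waiter {
	bool queued;	/* currently on the completion's wait list? */
	bool woken;	/* woken by complete() since it last went to sleep? */
};

struct sim_completion {
	unsigned int done;
	struct sim_waiter *list[MAX_WAITERS];
	size_t nr;
};

static void sim_enqueue(struct sim_completion *x, struct sim_waiter *w)
{
	assert(x->nr < MAX_WAITERS);
	x->list[x->nr++] = w;
	w->queued = true;
}

/* Assumed model of complete(): bump 'done', wake and dequeue one waiter. */
static void sim_complete(struct sim_completion *x)
{
	x->done++;
	if (x->nr > 0) {
		struct sim_waiter *w = x->list[0];

		for (size_t i = 1; i < x->nr; i++)
			x->list[i - 1] = x->list[i];
		x->nr--;
		w->queued = false;
		w->woken = true;
	}
}

/* Returns true if waiter A is woken by the second complete(). */
static bool sim_lost_wakeup_scenario(bool requeue_fix)
{
	struct sim_completion x = { 0 };
	struct sim_waiter a = { 0 };

	sim_enqueue(&x, &a);	/* A: done == 0, so it queues and sleeps */
	sim_complete(&x);	/* waker: done = 1, A is woken and dequeued */
	x.done--;		/* B: gets in first, sees done == 1, takes it */

	/* A finally runs: done is 0 again, so it has to wait another round. */
	assert(x.done == 0 && a.woken);
	a.woken = false;
	if (requeue_fix && !a.queued)
		sim_enqueue(&x, &a);	/* the re-insertion step from the patch */

	sim_complete(&x);	/* second completion */
	return a.woken;
}
```

In this model, sim_lost_wakeup_scenario(false) leaves A asleep after the
second completion because A is no longer on the list, while
sim_lost_wakeup_scenario(true) re-queues A first and A is woken.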