From: Joakim Hernberg <jhernberg@alchemy.lu>
To: linux-rt-users@vger.kernel.org
Subject: Re: RT is freezing
Date: Wed, 7 Jan 2015 11:24:23 +0100
Message-ID: <20150107112423.228e67f3@balder.valhalla.alchemy.lu>
In-Reply-To: <54AB39D2.2090203@gmail.com>
On Mon, 05 Jan 2015 23:26:42 -0200
Gustavo Bittencourt <gbitten@gmail.com> wrote:
> It seems that the problem is with the nouveau driver. When I boot in
> failsafe graphic mode, the system works well. Here is my video
> configuration:
> $ lshw -c video
> *-display
> description: VGA compatible controller
> product: GF108M [GeForce GT 540M]
> vendor: NVIDIA Corporation
> physical id: 0
> bus info: pci@0000:01:00.0
> version: a1
> width: 64 bits
> clock: 33MHz
> capabilities: pm msi pciexpress vga_controller bus_master
> cap_list rom
> configuration: driver=nouveau latency=0
> resources: irq:53 memory:f4000000-f4ffffff
> memory:d0000000-dfffffff memory:e0000000-e1ffffff
> ioport:d000(size=128) memory:f5000000-f507ffff
>
>
> On 01/05/2015 08:47 PM, Gustavo Bittencourt wrote:
> > Hi everybody
> >
> > I compiled the 3.14.25-rt22, but my system freezes when I start
> > Unity and some programs like Chrome or Thunderbird. The problem
> > happens only when PREEMPT_RT_FULL=y. No log is generated. I would
> > like to find the root of this problem, but I don't know how. Do you
> > have any suggestion?
I don't know if this is related, and I'm sorry for mentioning nvidia on
the mailing list, but if it applies to nouveau too, I hope it's
alright :)
I have the same experience using the nvidia driver on a test system.
This patch was brought to my attention and I use it for Arch Linux's
realtime kernel. It appears to fix the X hangs on my nvidia test
machine (note that for me it's just X that hangs):
-NOTE: this patch is a rebase of John Blackwood's patch. On his kernel, he must be using
-an older simple wait patch - as his applies to kernel/sched/core.c, while the simple wait
-completion code lives in kernel/sched/completion.c ... I have ported this to test with
-nvidia, as I would like to see if it fixes the semaphore issues I have seen.
-I've kept the original patch comment intact:
I'm not 100% sure that the patch below will fix your problem, but we
saw something that sounds pretty similar to your issue involving the
nvidia driver and the preempt-rt patch. The nvidia driver uses the
completion support to create their own driver's notion of an internally
used semaphore.
Fix a race in the PRT wait for completion simple wait code.
A wait_for_completion() waiter task can be awoken by a task calling
complete(), but fail to consume the 'done' completion resource if it
loses a race with another task calling wait_for_completion() just as
it is waking up.
In this case, the awoken task will call schedule_timeout() again
without being in the simple wait queue.
So if the awoken task is unable to claim the 'done' completion resource,
check to see if it needs to be re-inserted into the wait list before
waiting again in schedule_timeout().
Fix-by: John Blackwood <john.blackwood@ccur.com>
--- linux-3.14/kernel/sched/completion.c 2014-05-22 14:01:03.879734869 -0400
+++ linux-3.14/kernel/sched/completion.c 2014-05-22 14:13:59.181688658 -0400
@@ -61,11 +61,19 @@
 do_wait_for_common(struct completion *x,
 		   long (*action)(long), long timeout, int state)
 {
+	int again = 0;
+
 	if (!x->done) {
 		DEFINE_SWAITER(wait);
 		swait_prepare_locked(&x->wait, &wait);
 		do {
+			/* Check to see if we lost race for 'done' and are
+			 * no longer in the wait list.
+			 */
+			if (unlikely(again) && list_empty(&wait.node))
+				swait_prepare_locked(&x->wait, &wait);
+
 			if (signal_pending_state(state, current)) {
 				timeout = -ERESTARTSYS;
 				break;
@@ -74,6 +82,7 @@
 			raw_spin_unlock_irq(&x->wait.lock);
 			timeout = action(timeout);
 			raw_spin_lock_irq(&x->wait.lock);
+			again = 1;
 		} while (!x->done && timeout);
 		swait_finish_locked(&x->wait, &wait);
 		if (!x->done)
--
Joakim