From: Thomas Schauss <schauss@tum.de>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: RT <linux-rt-users@vger.kernel.org>
Subject: Re: 3.2-rc1 and nvidia drivers
Date: Mon, 28 Nov 2011 11:08:26 +0100
Message-ID: <4ED35D9A.7090401@tum.de>
In-Reply-To: <alpine.LFD.2.02.1111161605580.4902@ionos>
[-- Attachment #1: Type: text/plain, Size: 6379 bytes --]
On 11/16/2011 04:06 PM, Thomas Gleixner wrote:
> On Wed, 16 Nov 2011, Thomas Schauss wrote:
>> Unfortunately, with 3.0-rt and the nvidia-driver we get complete system
>> freezes when starting X on several different hardware setups (a few systems
>> work fine). This is certainly caused by this combination. When using the
>> nouveau-driver everything works fine.
>
> Have you ever tried to run with CONFIG_PROVE_LOCKING=y ?
>
Hello,
thank you for that tip. I have now tried this and found no warnings
that seem related to the nvidia driver. Further testing revealed that
the driver works fine with CONFIG_PREEMPT_RTB; the freezes when running
startx occur as soon as we switch to CONFIG_PREEMPT_RT_FULL.
Regarding lockdep, we do get some warnings in slab.c -> cache_flusharray,
which however seem unrelated to nvidia. As we could not find any other
bug reports with the same locking warning, I have attached one example
below. You can find some complete boot logs (all with deadlock warnings,
each with a slightly different call stack) and my kernel config at
http://www.lsr.ei.tum.de/team/schauss/lockdep/
On rt-base I also get a lockdep warning, which however seems unrelated
to the rt-full one (it is not in cache_flusharray). You can find that
log on the same page.
Best Regards,
Thomas
Nov 17 17:34:49 fix kernel: [ 30.750925] =============================================
Nov 17 17:34:49 fix kernel: [ 30.750927] [ INFO: possible recursive locking detected ]
Nov 17 17:34:49 fix kernel: [ 30.750930] 3.0.9-25-rt #0
Nov 17 17:34:49 fix kernel: [ 30.750931] ---------------------------------------------
Nov 17 17:34:49 fix kernel: [ 30.750933] udevd/517 is trying to acquire lock:
Nov 17 17:34:49 fix kernel: [ 30.750935] (&parent->list_lock){+.+...}, at: [<ffffffff81613e63>] cache_flusharray+0x47/0xd6
Nov 17 17:34:49 fix kernel: [ 30.750944]
Nov 17 17:34:49 fix kernel: [ 30.750945] but task is already holding lock:
Nov 17 17:34:49 fix kernel: [ 30.750946] (&parent->list_lock){+.+...}, at: [<ffffffff81613e63>] cache_flusharray+0x47/0xd6
Nov 17 17:34:49 fix kernel: [ 30.750950]
Nov 17 17:34:49 fix kernel: [ 30.750951] other info that might help us debug this:
Nov 17 17:34:49 fix kernel: [ 30.750952] Possible unsafe locking scenario:
Nov 17 17:34:49 fix kernel: [ 30.750953]
Nov 17 17:34:49 fix kernel: [ 30.750954] CPU0
Nov 17 17:34:49 fix kernel: [ 30.750955] ----
Nov 17 17:34:49 fix kernel: [ 30.750956] lock(&parent->list_lock);
Nov 17 17:34:49 fix kernel: [ 30.750958] lock(&parent->list_lock);
Nov 17 17:34:49 fix kernel: [ 30.750959]
Nov 17 17:34:49 fix kernel: [ 30.750960] *** DEADLOCK ***
Nov 17 17:34:49 fix kernel: [ 30.750961]
Nov 17 17:34:49 fix kernel: [ 30.750962] May be due to missing lock nesting notation
Nov 17 17:34:49 fix kernel: [ 30.750963]
Nov 17 17:34:49 fix kernel: [ 30.750964] 2 locks held by udevd/517:
Nov 17 17:34:49 fix kernel: [ 30.750966] #0: (&per_cpu(slab_lock, __cpu).lock){+.+...}, at: [<ffffffff8116a5c6>] kfree+0xd6/0x380
Nov 17 17:34:49 fix kernel: [ 30.750973] #1: (&parent->list_lock){+.+...}, at: [<ffffffff81613e63>] cache_flusharray+0x47/0xd6
Nov 17 17:34:49 fix kernel: [ 30.750977]
Nov 17 17:34:49 fix kernel: [ 30.750977] stack backtrace:
Nov 17 17:34:49 fix kernel: [ 30.750980] Pid: 517, comm: udevd Not tainted 3.0.9-25-rt #0
Nov 17 17:34:49 fix kernel: [ 30.750982] Call Trace:
Nov 17 17:34:49 fix kernel: [ 30.750987] [<ffffffff810a0097>] print_deadlock_bug+0xf7/0x100
Nov 17 17:34:49 fix kernel: [ 30.750991] [<ffffffff810a1add>] validate_chain.isra.37+0x67d/0x720
Nov 17 17:34:49 fix kernel: [ 30.750995] [<ffffffff810a2478>] __lock_acquire+0x478/0x9c0
Nov 17 17:34:49 fix kernel: [ 30.750999] [<ffffffff8162ae19>] ? sub_preempt_count+0x29/0x60
Nov 17 17:34:49 fix kernel: [ 30.751003] [<ffffffff81627475>] ? _raw_spin_unlock+0x35/0x60
Nov 17 17:34:49 fix kernel: [ 30.751007] [<ffffffff81625f0b>] ? rt_spin_lock_slowlock+0x2eb/0x340
Nov 17 17:34:49 fix kernel: [ 30.751011] [<ffffffff81056be1>] ? get_parent_ip+0x11/0x50
Nov 17 17:34:49 fix kernel: [ 30.751014] [<ffffffff81613e63>] ? cache_flusharray+0x47/0xd6
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff810a2f64>] lock_acquire+0x94/0x160
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81613e63>] ? cache_flusharray+0x47/0xd6
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81626999>] rt_spin_lock+0x39/0x40
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81613e63>] ? cache_flusharray+0x47/0xd6
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff8105a90b>] ? migrate_disable+0x6b/0xe0
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81613e63>] cache_flusharray+0x47/0xd6
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81167a41>] kmem_cache_free+0x221/0x300
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81167b8f>] slab_destroy+0x6f/0xa0
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81167d32>] free_block+0x172/0x190
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81613eb4>] cache_flusharray+0x98/0xd6
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff814f1110>] ? __sk_free+0x130/0x160
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff814f1110>] ? __sk_free+0x130/0x160
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff8116a806>] kfree+0x316/0x380
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff814f5328>] ? skb_queue_purge+0x28/0x40
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff814f1110>] __sk_free+0x130/0x160
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff814f11d5>] sk_free+0x25/0x30
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff8152d908>] netlink_release+0x128/0x200
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff814ea388>] sock_release+0x28/0x90
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff814eaa57>] sock_close+0x17/0x30
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff8117b914>] __fput+0xb4/0x200
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff8117ba85>] fput+0x25/0x30
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81177d0c>] filp_close+0x6c/0x90
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff81177df0>] sys_close+0xc0/0x130
Nov 17 17:34:49 fix kernel: [ 30.751015] [<ffffffff8162ed02>] system_call_fastpath+0x16/0x1b
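
For readers unfamiliar with this kind of report: the trace shows
cache_flusharray taking &parent->list_lock, then reaching
cache_flusharray again via free_block -> slab_destroy ->
kmem_cache_free and taking a lock of the same class a second time.
The following is only a toy model in Python of a same-class check of
that kind, not lockdep's actual implementation. The names ToyLockdep,
RecursiveLockingError and the subclass argument are made up for this
illustration; the assumption is merely that a lock is identified by
its class plus an optional nesting subclass, which is what the
"missing lock nesting notation" hint refers to.

```python
class RecursiveLockingError(Exception):
    """Raised when a task takes a lock class it already holds."""


class ToyLockdep:
    """Toy per-task tracker of held (lock class, subclass) pairs.

    Hypothetical illustration only; real lockdep tracks far more
    (IRQ state, dependency chains between classes, read locks).
    """

    def __init__(self):
        self.held = []  # acquisition order, oldest first

    def acquire(self, lock_class, subclass=0):
        key = (lock_class, subclass)
        if key in self.held:
            # Same class and same subclass already held: report it,
            # even if the two locks are different instances.
            raise RecursiveLockingError(
                "possible recursive locking detected: %s" % lock_class
            )
        self.held.append(key)

    def release(self, lock_class, subclass=0):
        self.held.remove((lock_class, subclass))


def replay_trace(nested_subclass=0):
    """Replay the lock sequence from the report above.

    Returns True if the toy checker complains, False otherwise.
    """
    task = ToyLockdep()
    task.acquire("slab_lock")            # taken in kfree
    task.acquire("&parent->list_lock")   # first cache_flusharray
    try:
        # second cache_flusharray, reached through slab_destroy
        task.acquire("&parent->list_lock", subclass=nested_subclass)
    except RecursiveLockingError:
        return True
    return False
```

Calling replay_trace() makes the toy checker complain, while
replay_trace(nested_subclass=1) does not, because the inner
acquisition is then treated as a distinct class. Whether the warning
in the log is a false positive that only needs such an annotation, or
a real deadlock, is not established by this message.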
[-- Attachment #2: schauss.vcf --]
[-- Type: text/x-vcard, Size: 342 bytes --]
begin:vcard
fn:Thomas Schauss
n:Schauss;Thomas
org:Technische Universitaet Muenchen (TUM);Institute of Automatic Control Engineering (LSR)
adr:;;Theresienstr. 90;Munich;;80333;Germany
email;internet:schauss@tum.de
title:Dipl.-Ing. (Univ.)
tel;work:+49 89 289 23406
tel;fax:+49 89 289 28340
url:http://www.lsr.ei.tum.de
version:2.1
end:vcard