From mboxrd@z Thu Jan 1 00:00:00 1970
From: Martin Peres
Subject: Re: nouveau_fan_update: possible circular locking dependency detected
Date: Thu, 13 Mar 2014 14:54:55 +0100
Message-ID: <5321B8AF.60001@labri.fr>
References: <20140309145157.GA504@joi.home>
To: Ilia Mirkin, Marcin Slusarz
Cc: "nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org", "dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
List-Id: nouveau.vger.kernel.org

On 13/03/2014 14:38, Ilia Mirkin wrote:
> On Sun, Mar 9, 2014 at 10:51 AM, Marcin Slusarz wrote:
>> [ 326.168487] ======================================================
>> [ 326.168491] [ INFO: possible circular locking dependency detected ]
>> [ 326.168496] 3.13.6 #1270 Not tainted
>> [ 326.168500] -------------------------------------------------------
>> [ 326.168504] ldconfig/22297 is trying to acquire lock:
>> [ 326.168507]  (&(&priv->fan->lock)->rlock){-.-...}, at: [] nouveau_fan_update+0xeb/0x252 [nouveau]
>> [ 326.168551]
>> but task is already holding lock:
>> [ 326.168555]  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [] alarm_timer_callback+0xf1/0x179 [nouveau]
>> [ 326.168587]
>> which lock already depends on the new lock.
>>
>> [ 326.168592]
>> the existing dependency chain (in reverse order) is:
>> [ 326.168596]
>> -> #1 (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}:
>> [ 326.168606]        [] lock_acquire+0xce/0x117
>> [ 326.168615]        [] _raw_spin_lock_irqsave+0x3f/0x51
>> [ 326.168623]        [] alarm_timer_callback+0xf1/0x179 [nouveau]
>> [ 326.168651]        [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.168679]        [] nv04_timer_alarm+0xb5/0xbe [nouveau]
>> [ 326.168708]        [] nouveau_fan_update+0x234/0x252 [nouveau]
>> [ 326.168735]        [] nouveau_fan_alarm+0x15/0x17 [nouveau]
>> [ 326.168763]        [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.168790]        [] nv04_timer_intr+0x5b/0x13c [nouveau]
>> [ 326.168817]        [] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
>> [ 326.168838]        [] handle_irq_event_percpu+0x5c/0x1dc
>> [ 326.168846]        [] handle_irq_event+0x3c/0x5c
>> [ 326.168852]        [] handle_edge_irq+0xc4/0xeb
>> [ 326.168860]        [] handle_irq+0x120/0x12d
>> [ 326.168868]        [] do_IRQ+0x48/0xaf
>> [ 326.168873]        [] ret_from_intr+0x0/0x13
>> [ 326.168881]        [] arch_cpu_idle+0x13/0x1d
>> [ 326.168887]        [] cpu_startup_entry+0x140/0x218
>> [ 326.168895]        [] start_secondary+0x1bf/0x1c4
>> [ 326.168902]
>> -> #0 (&(&priv->fan->lock)->rlock){-.-...}:
>> [ 326.168913]        [] __lock_acquire+0x10be/0x182b
>> [ 326.168920]        [] lock_acquire+0xce/0x117
>> [ 326.168924]        [] _raw_spin_lock_irqsave+0x3f/0x51
>> [ 326.168931]        [] nouveau_fan_update+0xeb/0x252 [nouveau]
>> [ 326.168958]        [] nouveau_therm_fan_set+0x14/0x16 [nouveau]
>> [ 326.168984]        [] nouveau_therm_update+0x303/0x312 [nouveau]
>> [ 326.169011]        [] nouveau_therm_alarm+0x13/0x15 [nouveau]
>> [ 326.169038]        [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.169059]        [] nv04_timer_alarm+0xb5/0xbe [nouveau]
>> [ 326.169079]        [] alarm_timer_callback+0x15e/0x179 [nouveau]
>> [ 326.169101]        [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.169121]        [] nv04_timer_intr+0x5b/0x13c [nouveau]
>> [ 326.169142]        [] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
>> [ 326.169160]        [] handle_irq_event_percpu+0x5c/0x1dc
>> [ 326.169165]        [] handle_irq_event+0x3c/0x5c
>> [ 326.169170]        [] handle_edge_irq+0xc4/0xeb
>> [ 326.169175]        [] handle_irq+0x120/0x12d
>> [ 326.169179]        [] do_IRQ+0x48/0xaf
>> [ 326.169183]        [] ret_from_intr+0x0/0x13
>> [ 326.169189]
>> other info that might help us debug this:
>>
>> [ 326.169193]  Possible unsafe locking scenario:
>>
>> [ 326.169195]        CPU0                    CPU1
>> [ 326.169197]        ----                    ----
>> [ 326.169199]   lock(&(&priv->sensor.alarm_program_lock)->rlock);
>> [ 326.169205]                                lock(&(&priv->fan->lock)->rlock);
>> [ 326.169211]                                lock(&(&priv->sensor.alarm_program_lock)->rlock);
>> [ 326.169216]   lock(&(&priv->fan->lock)->rlock);
>> [ 326.169221]
>>  *** DEADLOCK ***
>>
>> [ 326.169225] 1 lock held by ldconfig/22297:
>> [ 326.169229]  #0:  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [] alarm_timer_callback+0xf1/0x179 [nouveau]
>> [ 326.169253]
>> stack backtrace:
>> [ 326.169258] CPU: 7 PID: 22297 Comm: ldconfig Not tainted 3.13.6 #1270
>> [ 326.169260] Hardware name: System manufacturer System Product Name/P6T SE, BIOS 0603 09/02/2009
>> [ 326.169264]  ffffffff90fb6360 ffff8801bfdc3a38 ffffffff9059e369 0000000000000006
>> [ 326.169273]  ffffffff90fb61b0 ffff8801bfdc3a88 ffffffff905998cf 0000000000000002
>> [ 326.169282]  ffff8800b148dbe0 0000000000000001 ffff8800b148e1e0 0000000000000001
>> [ 326.169342] Call Trace:
>> [ 326.169344]  [] dump_stack+0x4e/0x71
>> [ 326.169352]  [] print_circular_bug+0x2ad/0x2be
>> [ 326.169356]  [] __lock_acquire+0x10be/0x182b
>> [ 326.169360]  [] ? check_irq_usage+0x99/0xab
>> [ 326.169365]  [] lock_acquire+0xce/0x117
>> [ 326.169384]  [] ? nouveau_fan_update+0xeb/0x252 [nouveau]
>> [ 326.169388]  [] _raw_spin_lock_irqsave+0x3f/0x51
>> [ 326.169407]  [] ? nouveau_fan_update+0xeb/0x252 [nouveau]
>> [ 326.169426]  [] ? nv04_timer_alarm_trigger+0x18d/0x1cb [nouveau]
>> [ 326.169445]  [] nouveau_fan_update+0xeb/0x252 [nouveau]
>> [ 326.169465]  [] nouveau_therm_fan_set+0x14/0x16 [nouveau]
>> [ 326.169483]  [] nouveau_therm_update+0x303/0x312 [nouveau]
>> [ 326.169502]  [] nouveau_therm_alarm+0x13/0x15 [nouveau]
>> [ 326.169521]  [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.169541]  [] nv04_timer_alarm+0xb5/0xbe [nouveau]
>> [ 326.169560]  [] alarm_timer_callback+0x15e/0x179 [nouveau]
>> [ 326.169579]  [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.169598]  [] nv04_timer_intr+0x5b/0x13c [nouveau]
>> [ 326.169617]  [] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
>> [ 326.169621]  [] handle_irq_event_percpu+0x5c/0x1dc
>> [ 326.169624]  [] handle_irq_event+0x3c/0x5c
>> [ 326.169628]  [] handle_edge_irq+0xc4/0xeb
>> [ 326.169631]  [] handle_irq+0x120/0x12d
>> [ 326.169636]  [] ? irq_enter+0x13/0x64
>> [ 326.169640]  [] do_IRQ+0x48/0xaf
>> [ 326.169644]  [] common_interrupt+0x6f/0x6f
>> [ 326.169646]  [] ? retint_swapgs+0xe/0x13
>
> Marcin, how reproducible is this? What hardware was this on? If it's
> reasonably reproducible perhaps it makes sense to file a bug in the
> fd.o tracker?
>
> Martin, I think this is in code you've written (right?). Perhaps you
> can take a look? All that alarm/update/etc code that ends up
> immediately dispatching itself seems like a locking nightmare...
>
> -ilia

Hey Ilia,

I'll have a look at it tonight. Yes, this is a little nightmarish :s

Martin