* lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock
@ 2010-08-16 18:42 Stefan Richter
2010-08-16 21:42 ` Peter Zijlstra
2010-08-17 14:35 ` Yong Zhang
0 siblings, 2 replies; 17+ messages in thread
From: Stefan Richter @ 2010-08-16 18:42 UTC (permalink / raw)
To: linux1394-devel; +Cc: linux-kernel
Hi, I just saw the following (twice, with a reboot in between) with
kernel 2.6.35 plus firewire driver patched to their state in 2.6.36-rc1,
i.e. basically with a kernel according to commit e78483c5.
I suspect it might also happen without firewire patches, as 2.6.35
already contains the change where firewire-core's per-transaction timer
was introduced which shows up in the following log.
It happened while I was testing hotplugging of FireWire disks with
several ones on a bus together with several FireWire 800 repeaters,
which sometimes act up so that reconnect does not succeed immediately.
Perhaps such reconnection troubles are necessary to provoke this lockdep
report.
[I only accidentally discovered this on a test PC which do not use very
frequently. Unfortunately, I had lockdep disabled on my main PC on
which I test most of my FireWire hardware for quite a while now. I
guess I should enable lockdep there again, but I have a suspicion that I
won't leave it on for too long.]
Note how the IN-HARDIRQ-W talks about a SCSI host lock of a libata host,
whereas the locking in the firewire stack surely does only interact
(hopefully correctly) with SCSI host locks of firewire-sbp2 host
instances.
What does this all mean? How to act on it?
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.35 #1
---------------------------------------------------------
swapper/0 just changed the state of lock:
(&t->split_timeout_timer){+.-...}, at: [<c1032228>] run_timer_softirq+0x112/0x21c
but this lock was taken by another, HARDIRQ-safe lock in the past:
(&(shost->host_lock)->rlock){-.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
no locks held by swapper/0.
the shortest dependencies between 2nd lock and 1st lock:
-> (&(shost->host_lock)->rlock){-.-...} ops: 123308 {
IN-HARDIRQ-W at:
[<c104a080>] __lock_acquire+0x612/0x153e
[<c104b008>] lock_acquire+0x5c/0x73
[<c1268b57>] _raw_spin_lock_irqsave+0x2e/0x3e
[<c11ad508>] scsi_eh_scmd_add+0x25/0x8a
[<c11ad5b7>] scsi_times_out+0x4a/0x64
[<c10e9fca>] blk_rq_timed_out+0xf/0x4b
[<c10ea044>] blk_abort_request+0x3e/0x41
[<c11c43c0>] ata_qc_schedule_eh+0x55/0x67
[<c11b9c4e>] ata_qc_complete+0x59/0x1b6
[<c11c6751>] ata_hsm_qc_complete+0xb5/0xcf
[<c11c6e18>] ata_sff_hsm_move+0x6ad/0x6fa
[<c11c7002>] __ata_sff_port_intr+0x9f/0xad
[<c11c70c9>] ata_bmdma_port_intr+0xb9/0x127
[<c11c71c3>] ata_bmdma_interrupt+0x8c/0x170
[<c1059b58>] handle_IRQ_event+0x1d/0xa2
[<c105b45f>] handle_fasteoi_irq+0x79/0xb2
[<c10045bb>] handle_irq+0x3b/0x48
[<c1003df9>] do_IRQ+0x45/0x9f
[<c1002d2e>] common_interrupt+0x2e/0x34
[<c11f4500>] cpuidle_idle_call+0x6d/0xa1
[<c1001a72>] cpu_idle+0x49/0x77
[<c12594f7>] rest_init+0xbf/0xc4
[<c13c98c9>] start_kernel+0x306/0x30b
[<c13c90b7>] i386_start_kernel+0xb7/0xbf
IN-SOFTIRQ-W at:
[<c104a0a1>] __lock_acquire+0x633/0x153e
[<c104b008>] lock_acquire+0x5c/0x73
[<c1268b57>] _raw_spin_lock_irqsave+0x2e/0x3e
[<c11aec72>] scsi_device_unbusy+0x1e/0x8d
[<c11a92fa>] scsi_finish_command+0x22/0xdf
[<c11af291>] scsi_softirq_done+0xf4/0xfc
[<c10e9d9c>] blk_done_softirq+0x5c/0x69
[<c102defb>] __do_softirq+0x8b/0x10a
[<c102dfa5>] do_softirq+0x2b/0x43
[<c102e031>] run_ksoftirqd+0x74/0x15a
[<c103ae19>] kthread+0x61/0x66
[<c1002d3a>] kernel_thread_helper+0x6/0x1a
INITIAL USE at:
[<c104a17b>] __lock_acquire+0x70d/0x153e
[<c104b008>] lock_acquire+0x5c/0x73
[<c1268b57>] _raw_spin_lock_irqsave+0x2e/0x3e
[<c11ad5e0>] scsi_schedule_eh+0xf/0x4f
[<c11c4125>] ata_port_schedule_eh+0x37/0x3a
[<c11bd629>] async_port_probe+0x65/0x9c
[<c1040051>] async_thread+0xe7/0x1c8
[<c103ae19>] kthread+0x61/0x66
[<c1002d3a>] kernel_thread_helper+0x6/0x1a
}
... key at: [<c19b0ab4>] __key.30680+0x0/0x8
... acquired at:
[<c104b008>] lock_acquire+0x5c/0x73
[<c1032a3f>] del_timer_sync+0x29/0x6c
[<f84a6ea0>] close_transaction+0x93/0xb4 [firewire_core]
[<f84a6f00>] transmit_complete_callback+0x3f/0x43 [firewire_core]
[<f8da60ff>] at_context_transmit+0x59b/0x5a3 [firewire_ohci]
[<f8da6123>] ohci_send_request+0xd/0xf [firewire_ohci]
[<f84a7ce4>] fw_send_request+0x21c/0x224 [firewire_core]
[<f8dab307>] 0xf8dab307
[<f8dac075>] 0xf8dac075
[<c11a962a>] scsi_dispatch_cmd+0x182/0x1f2
[<c11ae4ed>] scsi_request_fn+0x359/0x492
[<c10e644f>] __generic_unplug_device+0x26/0x29
[<c10e9306>] blk_execute_rq_nowait+0x56/0x75
[<c10e93cd>] blk_execute_rq+0xa8/0xc7
[<c11af359>] scsi_execute+0xc0/0x105
[<c11af3f3>] scsi_execute_req+0x55/0x7d
[<c11b3d9a>] sd_start_stop_device+0x7b/0x117
[<c11b41a7>] sd_shutdown+0x108/0x117
[<c11b42d0>] sd_remove+0x42/0x6f
[<c11975af>] __device_release_driver+0x5e/0x96
[<c119768d>] device_release_driver+0x1a/0x25
[<c1196dcc>] bus_remove_device+0x7b/0x92
[<c119592a>] device_del+0xfa/0x154
[<c11b21ae>] __scsi_remove_device+0x3d/0x7b
[<c11b2208>] scsi_remove_device+0x1c/0x27
[<f8dab83e>] 0xf8dab83e
[<c10f2b89>] kref_put+0x39/0x44
[<f8dab0ff>] 0xf8dab0ff
[<c11975af>] __device_release_driver+0x5e/0x96
[<c119768d>] device_release_driver+0x1a/0x25
[<c1196dcc>] bus_remove_device+0x7b/0x92
[<c119592a>] device_del+0xfa/0x154
[<c119598f>] device_unregister+0xb/0x15
[<f84a4eb8>] shutdown_unit+0x8/0xc [firewire_core]
[<c1195270>] device_for_each_child+0x2a/0x54
[<f84a4f2e>] fw_device_shutdown+0x72/0xa4 [firewire_core]
[<c1037f34>] worker_thread+0x16a/0x242
[<c103ae19>] kthread+0x61/0x66
[<c1002d3a>] kernel_thread_helper+0x6/0x1a
-> (&t->split_timeout_timer){+.-...} ops: 14281 {
HARDIRQ-ON-W at:
[<c104a100>] __lock_acquire+0x692/0x153e
[<c104b008>] lock_acquire+0x5c/0x73
[<c1032291>] run_timer_softirq+0x17b/0x21c
[<c102defb>] __do_softirq+0x8b/0x10a
[<c102dfa5>] do_softirq+0x2b/0x43
[<c102e14f>] irq_exit+0x38/0x74
[<c1003e3d>] do_IRQ+0x89/0x9f
[<c1002d2e>] common_interrupt+0x2e/0x34
[<c11f4500>] cpuidle_idle_call+0x6d/0xa1
[<c1001a72>] cpu_idle+0x49/0x77
[<c12594f7>] rest_init+0xbf/0xc4
[<c13c98c9>] start_kernel+0x306/0x30b
[<c13c90b7>] i386_start_kernel+0xb7/0xbf
IN-SOFTIRQ-W at:
[<c104a0a1>] __lock_acquire+0x633/0x153e
[<c104b008>] lock_acquire+0x5c/0x73
[<c1032a3f>] del_timer_sync+0x29/0x6c
[<f84a6ddc>] fw_core_handle_response+0x108/0x139 [firewire_core]
[<f8da5b58>] handle_ar_packet+0x108/0x114 [firewire_ohci]
[<f8da6236>] ar_context_tasklet+0x111/0x11f [firewire_ohci]
[<c102da6d>] tasklet_action+0x8d/0xe0
[<c102defb>] __do_softirq+0x8b/0x10a
[<c102dfa5>] do_softirq+0x2b/0x43
[<c102e14f>] irq_exit+0x38/0x74
[<c1003e3d>] do_IRQ+0x89/0x9f
[<c1002d2e>] common_interrupt+0x2e/0x34
[<c11f4500>] cpuidle_idle_call+0x6d/0xa1
[<c1001a72>] cpu_idle+0x49/0x77
[<c12594f7>] rest_init+0xbf/0xc4
[<c13c98c9>] start_kernel+0x306/0x30b
[<c13c90b7>] i386_start_kernel+0xb7/0xbf
INITIAL USE at:
[<c104a17b>] __lock_acquire+0x70d/0x153e
[<c104b008>] lock_acquire+0x5c/0x73
[<c1032a3f>] del_timer_sync+0x29/0x6c
[<f84a6ddc>] fw_core_handle_response+0x108/0x139 [firewire_core]
[<f8da5de2>] at_context_transmit+0x27e/0x5a3 [firewire_ohci]
[<f8da6123>] ohci_send_request+0xd/0xf [firewire_ohci]
[<f84a7ce4>] fw_send_request+0x21c/0x224 [firewire_core]
[<f84a7d71>] fw_run_transaction+0x85/0xc5 [firewire_core]
[<f84a2485>] bm_work+0x16c/0x34b [firewire_core]
[<c1037f34>] worker_thread+0x16a/0x242
[<c103ae19>] kthread+0x61/0x66
[<c1002d3a>] kernel_thread_helper+0x6/0x1a
}
... key at: [<f84a97d0>] __key.22471+0x0/0xffffe611 [firewire_core]
... acquired at:
[<c10494ba>] check_usage_backwards+0x94/0x9f
[<c1048b60>] mark_lock+0x2c0/0x4cd
[<c104a100>] __lock_acquire+0x692/0x153e
[<c104b008>] lock_acquire+0x5c/0x73
[<c1032291>] run_timer_softirq+0x17b/0x21c
[<c102defb>] __do_softirq+0x8b/0x10a
[<c102dfa5>] do_softirq+0x2b/0x43
[<c102e14f>] irq_exit+0x38/0x74
[<c1003e3d>] do_IRQ+0x89/0x9f
[<c1002d2e>] common_interrupt+0x2e/0x34
[<c11f4500>] cpuidle_idle_call+0x6d/0xa1
[<c1001a72>] cpu_idle+0x49/0x77
[<c12594f7>] rest_init+0xbf/0xc4
[<c13c98c9>] start_kernel+0x306/0x30b
[<c13c90b7>] i386_start_kernel+0xb7/0xbf
stack backtrace:
Pid: 0, comm: swapper Not tainted 2.6.35 #1
Call Trace:
[<c1266701>] ? printk+0xf/0x16
[<c104937c>] print_irq_inversion_bug+0xeb/0xf6
[<c10494ba>] check_usage_backwards+0x94/0x9f
[<c1048b60>] mark_lock+0x2c0/0x4cd
[<c1049426>] ? check_usage_backwards+0x0/0x9f
[<c104a100>] __lock_acquire+0x692/0x153e
[<c126910a>] ? _raw_spin_unlock_irqrestore+0x54/0x58
[<c10fbc1f>] ? debug_object_deactivate+0xa1/0xcc
[<c104b008>] lock_acquire+0x5c/0x73
[<c1032228>] ? run_timer_softirq+0x112/0x21c
[<c1032291>] run_timer_softirq+0x17b/0x21c
[<c1032228>] ? run_timer_softirq+0x112/0x21c
[<f84a77ce>] ? split_transaction_timeout_callback+0x0/0x8b [firewire_core]
[<c102defb>] __do_softirq+0x8b/0x10a
[<c102dfa5>] do_softirq+0x2b/0x43
[<c102e14f>] irq_exit+0x38/0x74
[<c1003e3d>] do_IRQ+0x89/0x9f
[<c1002d2e>] common_interrupt+0x2e/0x34
[<c1140f6d>] ? acpi_idle_enter_simple+0x122/0x14e
[<c11f4500>] cpuidle_idle_call+0x6d/0xa1
[<c1001a72>] cpu_idle+0x49/0x77
[<c12594f7>] rest_init+0xbf/0xc4
[<c13c98c9>] start_kernel+0x306/0x30b
[<c13c90b7>] i386_start_kernel+0xb7/0xbf
--
Stefan Richter
-=====-==-=- =--- =----
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-16 18:42 lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock Stefan Richter @ 2010-08-16 21:42 ` Peter Zijlstra 2010-08-16 22:09 ` Johannes Berg 2010-08-16 22:27 ` Johannes Berg 2010-08-17 14:35 ` Yong Zhang 1 sibling, 2 replies; 17+ messages in thread From: Peter Zijlstra @ 2010-08-16 21:42 UTC (permalink / raw) To: Stefan Richter; +Cc: linux1394-devel, linux-kernel, Johannes Berg On Mon, 2010-08-16 at 20:42 +0200, Stefan Richter wrote: > What does this all mean? How to act on it? > > ========================================================= > [ INFO: possible irq lock inversion dependency detected ] > 2.6.35 #1 > --------------------------------------------------------- > swapper/0 just changed the state of lock: > (&t->split_timeout_timer){+.-...}, at: [<c1032228>] run_timer_softirq+0x112/0x21c > but this lock was taken by another, HARDIRQ-safe lock in the past: > (&(shost->host_lock)->rlock){-.-...} > > and interrupts could create inverse lock ordering between them. It says the locks nest like: (&(shost->host_lock)->rlock){-.-...} (&t->split_timeout_timer){+.-...} But: > -> (&(shost->host_lock)->rlock){-.-...} ops: 123308 { > IN-HARDIRQ-W at: > [<c104a080>] __lock_acquire+0x612/0x153e > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1268b57>] _raw_spin_lock_irqsave+0x2e/0x3e > [<c11ad508>] scsi_eh_scmd_add+0x25/0x8a ... vs > -> (&t->split_timeout_timer){+.-...} ops: 14281 { > HARDIRQ-ON-W at: > [<c104a100>] __lock_acquire+0x692/0x153e > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1032291>] run_timer_softirq+0x17b/0x21c > [<c102defb>] __do_softirq+0x8b/0x10a > [<c102dfa5>] do_softirq+0x2b/0x43 ... So shost->host_lock->rlock is a lock used in hardirq context but split_timeout_timer is a !irq-safe lock. Which means that it now worries the following can happen: softirq: spin_lock(&t->split_timeout_timer); IRQ: spin_lock(&(shost->host_lock)->rlock); spin_lock(&t->split_timeout_timer); Now, the thing is that split_timeout_timer is a fake lock used to annotate timers, its use is to connect lock chains from within the timer callback to del_timer_sync() callers, to detect deadlocks. Now, I can't seem to remember why del_timer_sync() explicitly disables IRQs but call_timer_fn() does not, Johill, happen to remember? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-16 21:42 ` Peter Zijlstra @ 2010-08-16 22:09 ` Johannes Berg 2010-08-16 22:27 ` Johannes Berg 1 sibling, 0 replies; 17+ messages in thread From: Johannes Berg @ 2010-08-16 22:09 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Stefan Richter, linux1394-devel, linux-kernel On Mon, 2010-08-16 at 23:42 +0200, Peter Zijlstra wrote: > Which means that it now worries the following can happen: > > softirq: > spin_lock(&t->split_timeout_timer); > > IRQ: > spin_lock(&(shost->host_lock)->rlock); > spin_lock(&t->split_timeout_timer); > > Now, the thing is that split_timeout_timer is a fake lock used to > annotate timers, its use is to connect lock chains from within the timer > callback to del_timer_sync() callers, to detect deadlocks. > > Now, I can't seem to remember why del_timer_sync() explicitly disables > IRQs but call_timer_fn() does not, Johill, happen to remember? Err, sorry, no. I do remember that we had long discussions and initially I had wanted to disable the context checking for these fake locks completely. I think it was just the minimal thing needed not to make it warn always. Also, we can't do the same thing in call_timer_fn() since we need to hold it across fn()? Wouldn't it complain if we held that lock then while enabling IRQs again? Not sure. I don't fully understand the scenario above yet though. Why does it think we could ever take the fake lock in an IRQ to start with? That must not happen. Or is that just preemptive worrying? johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-16 21:42 ` Peter Zijlstra 2010-08-16 22:09 ` Johannes Berg @ 2010-08-16 22:27 ` Johannes Berg 2010-08-17 16:22 ` Stefan Richter 1 sibling, 1 reply; 17+ messages in thread From: Johannes Berg @ 2010-08-16 22:27 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Stefan Richter, linux1394-devel, linux-kernel On Mon, 2010-08-16 at 23:42 +0200, Peter Zijlstra wrote: > softirq: > spin_lock(&t->split_timeout_timer); > > IRQ: > spin_lock(&(shost->host_lock)->rlock); > spin_lock(&t->split_timeout_timer); Actually, I think it's worried that you could have CPU 0 CPU 1 softirq: softirq: spin_lock(timer) spin_lock(rlock) irq: spin_lock(rlock) spin_lock(timer) [still in softirq] because it has previously seen the nesting that I put on CPU 1, and now found that the timer "lock" is used with interrupts enabled. At least that's a scenario I could understand? I'm convinced it's a false positive though, question is how to shut it up :-) johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-16 22:27 ` Johannes Berg @ 2010-08-17 16:22 ` Stefan Richter 2010-08-18 13:17 ` Stefan Richter 0 siblings, 1 reply; 17+ messages in thread From: Stefan Richter @ 2010-08-17 16:22 UTC (permalink / raw) To: Johannes Berg; +Cc: Peter Zijlstra, linux1394-devel, linux-kernel Johannes Berg wrote: > I'm convinced it's a false positive though, question is how to shut it > up :-) Is there anyway to differentiate the lockdep naming of shost->host_lock (initialized in scsi_host_alloc) at runtime by means of the scsi_host_template instance given to scsi_host_alloc? -- Stefan Richter -=====-==-=- =--- =---= http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-17 16:22 ` Stefan Richter @ 2010-08-18 13:17 ` Stefan Richter 0 siblings, 0 replies; 17+ messages in thread From: Stefan Richter @ 2010-08-18 13:17 UTC (permalink / raw) To: Johannes Berg Cc: Peter Zijlstra, linux1394-devel, linux-kernel, Clemens Ladisch Stefan Richter wrote: > Johannes Berg wrote: >> I'm convinced it's a false positive though, question is how to shut it >> up :-) > > Is there anyway to differentiate the lockdep naming of shost->host_lock > (initialized in scsi_host_alloc) at runtime by means of the > scsi_host_template instance given to scsi_host_alloc? So, even though the part of the lockdep report about nesting with a SCSI host lock was a false positive, it did motivate Clemens' del_timer_sync -> del_timer replacement, so it was helpful after all. I guess there is no immediate need anymore to do something about the conflation of shost->host_lock instances. -- Stefan Richter -=====-==-=- =--- =--=- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-16 18:42 lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock Stefan Richter 2010-08-16 21:42 ` Peter Zijlstra @ 2010-08-17 14:35 ` Yong Zhang 2010-08-17 16:23 ` Stefan Richter 1 sibling, 1 reply; 17+ messages in thread From: Yong Zhang @ 2010-08-17 14:35 UTC (permalink / raw) To: Stefan Richter; +Cc: linux1394-devel, linux-kernel, Thomas Gleixner On Mon, Aug 16, 2010 at 08:42:34PM +0200, Stefan Richter wrote: > Hi, I just saw the following (twice, with a reboot in between) with > kernel 2.6.35 plus firewire driver patched to their state in 2.6.36-rc1, > i.e. basically with a kernel according to commit e78483c5. > > I suspect it might also happen without firewire patches, as 2.6.35 > already contains the change where firewire-core's per-transaction timer > was introduced which shows up in the following log. Cc'ing Thomas. I suspect it's introduced by commit 5c40cbfefa828208c671e2f58789e4dd04f79563 which call del_timer_sync() in softirq. comments on del_timer_sync() say "It must not be called from interrupt contexts." Thanks, Yong > > It happened while I was testing hotplugging of FireWire disks with > several ones on a bus together with several FireWire 800 repeaters, > which sometimes act up so that reconnect does not succeed immediately. > Perhaps such reconnection troubles are necessary to provoke this lockdep > report. > > [I only accidentally discovered this on a test PC which do not use very > frequently. Unfortunately, I had lockdep disabled on my main PC on > which I test most of my FireWire hardware for quite a while now. I > guess I should enable lockdep there again, but I have a suspicion that I > won't leave it on for too long.] > > Note how the IN-HARDIRQ-W talks about a SCSI host lock of a libata host, > whereas the locking in the firewire stack surely does only interact > (hopefully correctly) with SCSI host locks of firewire-sbp2 host > instances. > > What does this all mean? How to act on it? > > ========================================================= > [ INFO: possible irq lock inversion dependency detected ] > 2.6.35 #1 > --------------------------------------------------------- > swapper/0 just changed the state of lock: > (&t->split_timeout_timer){+.-...}, at: [<c1032228>] run_timer_softirq+0x112/0x21c > but this lock was taken by another, HARDIRQ-safe lock in the past: > (&(shost->host_lock)->rlock){-.-...} > > and interrupts could create inverse lock ordering between them. > > > other info that might help us debug this: > no locks held by swapper/0. > > the shortest dependencies between 2nd lock and 1st lock: > -> (&(shost->host_lock)->rlock){-.-...} ops: 123308 { > IN-HARDIRQ-W at: > [<c104a080>] __lock_acquire+0x612/0x153e > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1268b57>] _raw_spin_lock_irqsave+0x2e/0x3e > [<c11ad508>] scsi_eh_scmd_add+0x25/0x8a > [<c11ad5b7>] scsi_times_out+0x4a/0x64 > [<c10e9fca>] blk_rq_timed_out+0xf/0x4b > [<c10ea044>] blk_abort_request+0x3e/0x41 > [<c11c43c0>] ata_qc_schedule_eh+0x55/0x67 > [<c11b9c4e>] ata_qc_complete+0x59/0x1b6 > [<c11c6751>] ata_hsm_qc_complete+0xb5/0xcf > [<c11c6e18>] ata_sff_hsm_move+0x6ad/0x6fa > [<c11c7002>] __ata_sff_port_intr+0x9f/0xad > [<c11c70c9>] ata_bmdma_port_intr+0xb9/0x127 > [<c11c71c3>] ata_bmdma_interrupt+0x8c/0x170 > [<c1059b58>] handle_IRQ_event+0x1d/0xa2 > [<c105b45f>] handle_fasteoi_irq+0x79/0xb2 > [<c10045bb>] handle_irq+0x3b/0x48 > [<c1003df9>] do_IRQ+0x45/0x9f > [<c1002d2e>] common_interrupt+0x2e/0x34 > [<c11f4500>] cpuidle_idle_call+0x6d/0xa1 > [<c1001a72>] cpu_idle+0x49/0x77 > [<c12594f7>] rest_init+0xbf/0xc4 > [<c13c98c9>] start_kernel+0x306/0x30b > [<c13c90b7>] i386_start_kernel+0xb7/0xbf > IN-SOFTIRQ-W at: > [<c104a0a1>] __lock_acquire+0x633/0x153e > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1268b57>] _raw_spin_lock_irqsave+0x2e/0x3e > [<c11aec72>] scsi_device_unbusy+0x1e/0x8d > [<c11a92fa>] scsi_finish_command+0x22/0xdf > [<c11af291>] scsi_softirq_done+0xf4/0xfc > [<c10e9d9c>] blk_done_softirq+0x5c/0x69 > [<c102defb>] __do_softirq+0x8b/0x10a > [<c102dfa5>] do_softirq+0x2b/0x43 > [<c102e031>] run_ksoftirqd+0x74/0x15a > [<c103ae19>] kthread+0x61/0x66 > [<c1002d3a>] kernel_thread_helper+0x6/0x1a > INITIAL USE at: > [<c104a17b>] __lock_acquire+0x70d/0x153e > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1268b57>] _raw_spin_lock_irqsave+0x2e/0x3e > [<c11ad5e0>] scsi_schedule_eh+0xf/0x4f > [<c11c4125>] ata_port_schedule_eh+0x37/0x3a > [<c11bd629>] async_port_probe+0x65/0x9c > [<c1040051>] async_thread+0xe7/0x1c8 > [<c103ae19>] kthread+0x61/0x66 > [<c1002d3a>] kernel_thread_helper+0x6/0x1a > } > ... key at: [<c19b0ab4>] __key.30680+0x0/0x8 > ... acquired at: > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1032a3f>] del_timer_sync+0x29/0x6c > [<f84a6ea0>] close_transaction+0x93/0xb4 [firewire_core] > [<f84a6f00>] transmit_complete_callback+0x3f/0x43 [firewire_core] > [<f8da60ff>] at_context_transmit+0x59b/0x5a3 [firewire_ohci] > [<f8da6123>] ohci_send_request+0xd/0xf [firewire_ohci] > [<f84a7ce4>] fw_send_request+0x21c/0x224 [firewire_core] > [<f8dab307>] 0xf8dab307 > [<f8dac075>] 0xf8dac075 > [<c11a962a>] scsi_dispatch_cmd+0x182/0x1f2 > [<c11ae4ed>] scsi_request_fn+0x359/0x492 > [<c10e644f>] __generic_unplug_device+0x26/0x29 > [<c10e9306>] blk_execute_rq_nowait+0x56/0x75 > [<c10e93cd>] blk_execute_rq+0xa8/0xc7 > [<c11af359>] scsi_execute+0xc0/0x105 > [<c11af3f3>] scsi_execute_req+0x55/0x7d > [<c11b3d9a>] sd_start_stop_device+0x7b/0x117 > [<c11b41a7>] sd_shutdown+0x108/0x117 > [<c11b42d0>] sd_remove+0x42/0x6f > [<c11975af>] __device_release_driver+0x5e/0x96 > [<c119768d>] device_release_driver+0x1a/0x25 > [<c1196dcc>] bus_remove_device+0x7b/0x92 > [<c119592a>] device_del+0xfa/0x154 > [<c11b21ae>] __scsi_remove_device+0x3d/0x7b > [<c11b2208>] scsi_remove_device+0x1c/0x27 > [<f8dab83e>] 0xf8dab83e > [<c10f2b89>] kref_put+0x39/0x44 > [<f8dab0ff>] 0xf8dab0ff > [<c11975af>] __device_release_driver+0x5e/0x96 > [<c119768d>] device_release_driver+0x1a/0x25 > [<c1196dcc>] bus_remove_device+0x7b/0x92 > [<c119592a>] device_del+0xfa/0x154 > [<c119598f>] device_unregister+0xb/0x15 > [<f84a4eb8>] shutdown_unit+0x8/0xc [firewire_core] > [<c1195270>] device_for_each_child+0x2a/0x54 > [<f84a4f2e>] fw_device_shutdown+0x72/0xa4 [firewire_core] > [<c1037f34>] worker_thread+0x16a/0x242 > [<c103ae19>] kthread+0x61/0x66 > [<c1002d3a>] kernel_thread_helper+0x6/0x1a > > -> (&t->split_timeout_timer){+.-...} ops: 14281 { > HARDIRQ-ON-W at: > [<c104a100>] __lock_acquire+0x692/0x153e > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1032291>] run_timer_softirq+0x17b/0x21c > [<c102defb>] __do_softirq+0x8b/0x10a > [<c102dfa5>] do_softirq+0x2b/0x43 > [<c102e14f>] irq_exit+0x38/0x74 > [<c1003e3d>] do_IRQ+0x89/0x9f > [<c1002d2e>] common_interrupt+0x2e/0x34 > [<c11f4500>] cpuidle_idle_call+0x6d/0xa1 > [<c1001a72>] cpu_idle+0x49/0x77 > [<c12594f7>] rest_init+0xbf/0xc4 > [<c13c98c9>] start_kernel+0x306/0x30b > [<c13c90b7>] i386_start_kernel+0xb7/0xbf > IN-SOFTIRQ-W at: > [<c104a0a1>] __lock_acquire+0x633/0x153e > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1032a3f>] del_timer_sync+0x29/0x6c > [<f84a6ddc>] fw_core_handle_response+0x108/0x139 [firewire_core] > [<f8da5b58>] handle_ar_packet+0x108/0x114 [firewire_ohci] > [<f8da6236>] ar_context_tasklet+0x111/0x11f [firewire_ohci] > [<c102da6d>] tasklet_action+0x8d/0xe0 > [<c102defb>] __do_softirq+0x8b/0x10a > [<c102dfa5>] do_softirq+0x2b/0x43 > [<c102e14f>] irq_exit+0x38/0x74 > [<c1003e3d>] do_IRQ+0x89/0x9f > [<c1002d2e>] common_interrupt+0x2e/0x34 > [<c11f4500>] cpuidle_idle_call+0x6d/0xa1 > [<c1001a72>] cpu_idle+0x49/0x77 > [<c12594f7>] rest_init+0xbf/0xc4 > [<c13c98c9>] start_kernel+0x306/0x30b > [<c13c90b7>] i386_start_kernel+0xb7/0xbf > INITIAL USE at: > [<c104a17b>] __lock_acquire+0x70d/0x153e > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1032a3f>] del_timer_sync+0x29/0x6c > [<f84a6ddc>] fw_core_handle_response+0x108/0x139 [firewire_core] > [<f8da5de2>] at_context_transmit+0x27e/0x5a3 [firewire_ohci] > [<f8da6123>] ohci_send_request+0xd/0xf [firewire_ohci] > [<f84a7ce4>] fw_send_request+0x21c/0x224 [firewire_core] > [<f84a7d71>] fw_run_transaction+0x85/0xc5 [firewire_core] > [<f84a2485>] bm_work+0x16c/0x34b [firewire_core] > [<c1037f34>] worker_thread+0x16a/0x242 > [<c103ae19>] kthread+0x61/0x66 > [<c1002d3a>] kernel_thread_helper+0x6/0x1a > } > ... key at: [<f84a97d0>] __key.22471+0x0/0xffffe611 [firewire_core] > ... acquired at: > [<c10494ba>] check_usage_backwards+0x94/0x9f > [<c1048b60>] mark_lock+0x2c0/0x4cd > [<c104a100>] __lock_acquire+0x692/0x153e > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1032291>] run_timer_softirq+0x17b/0x21c > [<c102defb>] __do_softirq+0x8b/0x10a > [<c102dfa5>] do_softirq+0x2b/0x43 > [<c102e14f>] irq_exit+0x38/0x74 > [<c1003e3d>] do_IRQ+0x89/0x9f > [<c1002d2e>] common_interrupt+0x2e/0x34 > [<c11f4500>] cpuidle_idle_call+0x6d/0xa1 > [<c1001a72>] cpu_idle+0x49/0x77 > [<c12594f7>] rest_init+0xbf/0xc4 > [<c13c98c9>] start_kernel+0x306/0x30b > [<c13c90b7>] i386_start_kernel+0xb7/0xbf > > > stack backtrace: > Pid: 0, comm: swapper Not tainted 2.6.35 #1 > Call Trace: > [<c1266701>] ? printk+0xf/0x16 > [<c104937c>] print_irq_inversion_bug+0xeb/0xf6 > [<c10494ba>] check_usage_backwards+0x94/0x9f > [<c1048b60>] mark_lock+0x2c0/0x4cd > [<c1049426>] ? check_usage_backwards+0x0/0x9f > [<c104a100>] __lock_acquire+0x692/0x153e > [<c126910a>] ? _raw_spin_unlock_irqrestore+0x54/0x58 > [<c10fbc1f>] ? debug_object_deactivate+0xa1/0xcc > [<c104b008>] lock_acquire+0x5c/0x73 > [<c1032228>] ? run_timer_softirq+0x112/0x21c > [<c1032291>] run_timer_softirq+0x17b/0x21c > [<c1032228>] ? run_timer_softirq+0x112/0x21c > [<f84a77ce>] ? split_transaction_timeout_callback+0x0/0x8b [firewire_core] > [<c102defb>] __do_softirq+0x8b/0x10a > [<c102dfa5>] do_softirq+0x2b/0x43 > [<c102e14f>] irq_exit+0x38/0x74 > [<c1003e3d>] do_IRQ+0x89/0x9f > [<c1002d2e>] common_interrupt+0x2e/0x34 > [<c1140f6d>] ? acpi_idle_enter_simple+0x122/0x14e > [<c11f4500>] cpuidle_idle_call+0x6d/0xa1 > [<c1001a72>] cpu_idle+0x49/0x77 > [<c12594f7>] rest_init+0xbf/0xc4 > [<c13c98c9>] start_kernel+0x306/0x30b > [<c13c90b7>] i386_start_kernel+0xb7/0xbf > > > -- > Stefan Richter > -=====-==-=- =--- =---- > http://arcgraph.de/sr/ > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-17 14:35 ` Yong Zhang @ 2010-08-17 16:23 ` Stefan Richter 2010-08-18 7:01 ` Clemens Ladisch 0 siblings, 1 reply; 17+ messages in thread From: Stefan Richter @ 2010-08-17 16:23 UTC (permalink / raw) To: Yong Zhang Cc: linux1394-devel, linux-kernel, Thomas Gleixner, Peter Zijlstra, Johannes Berg Yong Zhang wrote: > I suspect it's introduced by commit 5c40cbfefa828208c671e2f58789e4dd04f79563 > which call del_timer_sync() in softirq. > > comments on del_timer_sync() say "It must not be called from interrupt contexts." I hope the del_timer_sync kerneldoc comment is about hardIRQ context, *otherwise* commit 5c40cbfe is defective indeed. The firewire stack calls del_timer_sync in process or softIRQ context (firewire-ohci tasklets, block layer soft IRQ, networking soft IRQ...) but never from hardIRQ. -- Stefan Richter -=====-==-=- =--- =---= http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-17 16:23 ` Stefan Richter @ 2010-08-18 7:01 ` Clemens Ladisch 2010-08-18 8:09 ` Stefan Richter 2010-08-18 8:59 ` Peter Zijlstra 0 siblings, 2 replies; 17+ messages in thread From: Clemens Ladisch @ 2010-08-18 7:01 UTC (permalink / raw) To: Stefan Richter Cc: Yong Zhang, Peter Zijlstra, Johannes Berg, Thomas Gleixner, linux1394-devel, linux-kernel Stefan Richter wrote: > Yong Zhang wrote: > > I suspect it's introduced by commit 5c40cbfefa828208c671e2f58789e4dd04f79563 > > which call del_timer_sync() in softirq. > > > > comments on del_timer_sync() say "It must not be called from interrupt contexts." > > I hope the del_timer_sync kerneldoc comment is about hardIRQ context, Then both the comment and its lockdep code would be wrong. > *otherwise* commit 5c40cbfe is defective indeed. --8<---------------------------------------------------------------->8-- firewire: core: do not use del_timer_sync() in interrupt context Because we might be in interrupt context, replace del_timer_sync() with del_timer(). If the timer was already running when we tried to delete it, explicitly wait for it to finish to ensure that it does not access the transaction data after the normal completion code might have freed it. Many thanks to Yong Zhang for diagnosing this and to Rusty Russell for his Locking Guide. Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Clemens Ladisch <clemens@ladisch.de> --- drivers/firewire/core-transaction.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) --- a/drivers/firewire/core-transaction.c +++ b/drivers/firewire/core-transaction.c @@ -78,9 +78,16 @@ static int close_transaction(struct fw_t struct fw_transaction *t; unsigned long flags; +retry: spin_lock_irqsave(&card->lock, flags); list_for_each_entry(t, &card->transaction_list, link) { if (t == transaction) { + if (!del_timer(&t->split_timeout_timer)) { + /* wait for the timer to cancel it */ + spin_unlock_irqrestore(&card->lock, flags); + cpu_relax(); + goto retry; + } list_del_init(&t->link); card->tlabel_mask &= ~(1ULL << t->tlabel); break; @@ -89,7 +96,6 @@ static int close_transaction(struct fw_t spin_unlock_irqrestore(&card->lock, flags); if (&t->link != &card->transaction_list) { - del_timer_sync(&t->split_timeout_timer); t->callback(card, rcode, NULL, 0, t->callback_data); return 0; } @@ -918,9 +924,16 @@ void fw_core_handle_response(struct fw_c source = HEADER_GET_SOURCE(p->header[1]); rcode = HEADER_GET_RCODE(p->header[1]); +retry: spin_lock_irqsave(&card->lock, flags); list_for_each_entry(t, &card->transaction_list, link) { if (t->node_id == source && t->tlabel == tlabel) { + if (!del_timer(&t->split_timeout_timer)) { + /* wait for the timer to cancel it */ + spin_unlock_irqrestore(&card->lock, flags); + cpu_relax(); + goto retry; + } list_del_init(&t->link); card->tlabel_mask &= ~(1ULL << t->tlabel); break; @@ -963,8 +976,6 @@ void fw_core_handle_response(struct fw_c break; } - del_timer_sync(&t->split_timeout_timer); - /* * The response handler may be executed while the request handler * is still pending. Cancel the request handler. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-18 7:01 ` Clemens Ladisch @ 2010-08-18 8:09 ` Stefan Richter 2010-08-18 8:53 ` Clemens Ladisch 2010-08-18 9:08 ` Yong Zhang 2010-08-18 8:59 ` Peter Zijlstra 1 sibling, 2 replies; 17+ messages in thread From: Stefan Richter @ 2010-08-18 8:09 UTC (permalink / raw) To: Clemens Ladisch Cc: Yong Zhang, Peter Zijlstra, Johannes Berg, Thomas Gleixner, linux1394-devel, linux-kernel Clemens Ladisch wrote: > firewire: core: do not use del_timer_sync() in interrupt context > > Because we might be in interrupt context, replace del_timer_sync() with > del_timer(). Again, is the term "interrupt contexts" meant to include softIRQ contexts? (Nothing in firewire-core/-sbp2/-net etc. is called in hardware interrupts. BTW, we actually could and maybe should change all the spinlocks in them from spin_lock_irq to spin_lock_bh. Only firewire-ohci's local ohci->lock needs to be IRQ safe. OTOH this could change if somebody comes up with a migration of the stack to threaded IRQ handling.) -- Stefan Richter -=====-==-=- =--- =--=- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-18 8:09 ` Stefan Richter @ 2010-08-18 8:53 ` Clemens Ladisch 2010-08-18 9:08 ` Yong Zhang 1 sibling, 0 replies; 17+ messages in thread From: Clemens Ladisch @ 2010-08-18 8:53 UTC (permalink / raw) To: Stefan Richter Cc: Yong Zhang, Peter Zijlstra, Johannes Berg, Thomas Gleixner, linux1394-devel, linux-kernel Stefan Richter wrote: > Clemens Ladisch wrote: > > Because we might be in interrupt context, replace del_timer_sync() with > > del_timer(). > > Again, is the term "interrupt contexts" meant to include softIRQ contexts? Yes: http://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/g618.html#GLOSS-INTERRUPTCONTEXT ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-18 8:09 ` Stefan Richter 2010-08-18 8:53 ` Clemens Ladisch @ 2010-08-18 9:08 ` Yong Zhang 1 sibling, 0 replies; 17+ messages in thread From: Yong Zhang @ 2010-08-18 9:08 UTC (permalink / raw) To: Stefan Richter Cc: Clemens Ladisch, Peter Zijlstra, Johannes Berg, Thomas Gleixner, linux1394-devel, linux-kernel On Wed, Aug 18, 2010 at 4:09 PM, Stefan Richter <stefanr@s5r6.in-berlin.de> wrote: > Again, is the term "interrupt contexts" meant to include softIRQ contexts? I can't think out a scene which prevent del_timer_sync() from using in softirq. :( Then I go back and find that comments comes before linux-2.6.12, then in linux-2.6.12, del_timer_sync() is like this: int del_timer_sync(struct timer_list *timer) { tvec_base_t *base; int i, ret = 0; check_timer(timer); del_again: ret += del_timer(timer); for_each_online_cpu(i) { base = &per_cpu(tvec_bases, i); if (base->running_timer == timer) { while (base->running_timer == timer) { cpu_relax(); preempt_check_resched(); } break; } } smp_rmb(); if (timer_pending(timer)) goto del_again; return ret; } EXPORT_SYMBOL(del_timer_sync); Seems preempt_check_resched() is the proof of that comments, but now del_timer_sync() is changed. I don't know if there is any other restriction. Thanks, Yong > > (Nothing in firewire-core/-sbp2/-net etc. is called in hardware > interrupts. BTW, we actually could and maybe should change all the > spinlocks in them from spin_lock_irq to spin_lock_bh. Only > firewire-ohci's local ohci->lock needs to be IRQ safe. OTOH this could > change if somebody comes up with a migration of the stack to threaded > IRQ handling.) > -- > Stefan Richter > -=====-==-=- =--- =--=- > http://arcgraph.de/sr/ > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-18 7:01 ` Clemens Ladisch 2010-08-18 8:09 ` Stefan Richter @ 2010-08-18 8:59 ` Peter Zijlstra 2010-08-18 9:26 ` Clemens Ladisch 1 sibling, 1 reply; 17+ messages in thread From: Peter Zijlstra @ 2010-08-18 8:59 UTC (permalink / raw) To: Clemens Ladisch Cc: Stefan Richter, Yong Zhang, Johannes Berg, Thomas Gleixner, linux1394-devel, linux-kernel On Wed, 2010-08-18 at 09:01 +0200, Clemens Ladisch wrote: > +retry: > spin_lock_irqsave(&card->lock, flags); > list_for_each_entry(t, &card->transaction_list, link) { > if (t == transaction) { > + if (!del_timer(&t->split_timeout_timer)) { > + /* wait for the timer to cancel it */ > + spin_unlock_irqrestore(&card->lock, flags); > + cpu_relax(); > + goto retry; > + } Open-coding spin loops like that is really ugly, and could cause trouble for -rt. Also, I believe that if you want the very same semantics as before, you need to use try_to_del_timer_sync(), not del_timer(). Also, if del_timer_sync() is not allowed from any interrupt context (including softirq) then doing the spin-loop like that doesn't actually solve anything. Thomas, any comments? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-18 8:59 ` Peter Zijlstra @ 2010-08-18 9:26 ` Clemens Ladisch 2010-08-18 9:43 ` Peter Zijlstra 2010-08-18 12:50 ` Stefan Richter 0 siblings, 2 replies; 17+ messages in thread From: Clemens Ladisch @ 2010-08-18 9:26 UTC (permalink / raw) To: Peter Zijlstra, Stefan Richter Cc: Yong Zhang, Johannes Berg, Thomas Gleixner, linux1394-devel, linux-kernel Peter Zijlstra wrote: > On Wed, 2010-08-18 at 09:01 +0200, Clemens Ladisch wrote: > > +retry: > > spin_lock_irqsave(&card->lock, flags); > > list_for_each_entry(t, &card->transaction_list, link) { > > if (t == transaction) { > > + if (!del_timer(&t->split_timeout_timer)) { > > + /* wait for the timer to cancel it */ > > + spin_unlock_irqrestore(&card->lock, flags); > > + cpu_relax(); > > + goto retry; > > + } > > Open-coding spin loops like that is really ugly, and could cause trouble > for -rt. > > Also, I believe that if you want the very same semantics as before, you > need to use try_to_del_timer_sync(), not del_timer(). Like del_timer_sync(), it "must not be called from interrupt contexts." > Also, if del_timer_sync() is not allowed from any interrupt context > (including softirq) then doing the spin-loop like that doesn't actually > solve anything. Why? Anyway, my first patch was crap because the loop isn't actually necessary: --8<---------------------------------------------------------------->8-- firewire: core: do not use del_timer_sync() in interrupt context Because we might be in interrupt context, replace del_timer_sync() with del_timer(). If the timer is already running, we know that it will clean up the transaction, so we do not need to do any further processing in the normal transaction handler. Many thanks to Yong Zhang for diagnosing this. Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Clemens Ladisch <clemens@ladisch.de> --- drivers/firewire/core-transaction.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) --- a/drivers/firewire/core-transaction.c +++ b/drivers/firewire/core-transaction.c @@ -81,6 +81,8 @@ static int close_transaction(struct fw_transaction *transaction, spin_lock_irqsave(&card->lock, flags); list_for_each_entry(t, &card->transaction_list, link) { if (t == transaction) { + if (!del_timer(&t->split_timeout_timer)) + goto timed_out; list_del_init(&t->link); card->tlabel_mask &= ~(1ULL << t->tlabel); break; @@ -89,11 +91,11 @@ static int close_transaction(struct fw_transaction *transaction, spin_unlock_irqrestore(&card->lock, flags); if (&t->link != &card->transaction_list) { - del_timer_sync(&t->split_timeout_timer); t->callback(card, rcode, NULL, 0, t->callback_data); return 0; } +timed_out: return -ENOENT; } @@ -921,6 +923,8 @@ void fw_core_handle_response(struct fw_card *card, struct fw_packet *p) spin_lock_irqsave(&card->lock, flags); list_for_each_entry(t, &card->transaction_list, link) { if (t->node_id == source && t->tlabel == tlabel) { + if (!del_timer(&t->split_timeout_timer)) + goto timed_out; list_del_init(&t->link); card->tlabel_mask &= ~(1ULL << t->tlabel); break; @@ -929,6 +933,7 @@ void fw_core_handle_response(struct fw_card *card, struct fw_packet *p) spin_unlock_irqrestore(&card->lock, flags); if (&t->link == &card->transaction_list) { +timed_out: fw_notify("Unsolicited response (source %x, tlabel %x)\n", source, tlabel); return; @@ -963,8 +968,6 @@ void fw_core_handle_response(struct fw_card *card, struct fw_packet *p) break; } - del_timer_sync(&t->split_timeout_timer); - /* * The response handler may be executed while the request handler * is still pending. Cancel the request handler. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-18 9:26 ` Clemens Ladisch @ 2010-08-18 9:43 ` Peter Zijlstra 2010-08-18 12:50 ` Stefan Richter 1 sibling, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2010-08-18 9:43 UTC (permalink / raw) To: Clemens Ladisch Cc: Stefan Richter, Yong Zhang, Johannes Berg, Thomas Gleixner, linux1394-devel, linux-kernel On Wed, 2010-08-18 at 11:26 +0200, Clemens Ladisch wrote: > Anyway, my first patch was crap because the loop isn't actually > necessary. Ok, without loop its all fine with me ;-) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-18 9:26 ` Clemens Ladisch 2010-08-18 9:43 ` Peter Zijlstra @ 2010-08-18 12:50 ` Stefan Richter 2010-08-18 13:05 ` Clemens Ladisch 1 sibling, 1 reply; 17+ messages in thread From: Stefan Richter @ 2010-08-18 12:50 UTC (permalink / raw) To: Clemens Ladisch Cc: Peter Zijlstra, Yong Zhang, Johannes Berg, Thomas Gleixner, linux1394-devel, linux-kernel Clemens Ladisch wrote: > firewire: core: do not use del_timer_sync() in interrupt context > > Because we might be in interrupt context, replace del_timer_sync() with > del_timer(). OK. And thanks for the pointers into the locking guide. > If the timer is already running, we know that it will > clean up the transaction, so we do not need to do any further processing > in the normal transaction handler. > > Many thanks to Yong Zhang for diagnosing this. > > Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de> > Signed-off-by: Clemens Ladisch <clemens@ladisch.de> > --- > drivers/firewire/core-transaction.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > --- a/drivers/firewire/core-transaction.c > +++ b/drivers/firewire/core-transaction.c > @@ -81,6 +81,8 @@ static int close_transaction(struct fw_transaction *transaction, > spin_lock_irqsave(&card->lock, flags); > list_for_each_entry(t, &card->transaction_list, link) { > if (t == transaction) { > + if (!del_timer(&t->split_timeout_timer)) > + goto timed_out; + if (!del_timer(&t->split_timeout_timer)) { + spin_unlock_irqrestore(&card->lock, flags); + goto timed_out; + } > list_del_init(&t->link); > card->tlabel_mask &= ~(1ULL << t->tlabel); > break; > @@ -89,11 +91,11 @@ static int close_transaction(struct fw_transaction *transaction, > spin_unlock_irqrestore(&card->lock, flags); > > if (&t->link != &card->transaction_list) { > - del_timer_sync(&t->split_timeout_timer); > t->callback(card, rcode, NULL, 0, t->callback_data); > return 0; > } > > +timed_out: > return -ENOENT; > } > > @@ -921,6 +923,8 @@ void fw_core_handle_response(struct fw_card *card, struct fw_packet *p) > spin_lock_irqsave(&card->lock, flags); > list_for_each_entry(t, &card->transaction_list, link) { > if (t->node_id == source && t->tlabel == tlabel) { > + if (!del_timer(&t->split_timeout_timer)) > + goto timed_out; Ditto. > list_del_init(&t->link); > card->tlabel_mask &= ~(1ULL << t->tlabel); > break; > @@ -929,6 +933,7 @@ void fw_core_handle_response(struct fw_card *card, struct fw_packet *p) > spin_unlock_irqrestore(&card->lock, flags); > > if (&t->link == &card->transaction_list) { > +timed_out: > fw_notify("Unsolicited response (source %x, tlabel %x)\n", > source, tlabel); > return; > @@ -963,8 +968,6 @@ void fw_core_handle_response(struct fw_card *card, struct fw_packet *p) > break; > } > > - del_timer_sync(&t->split_timeout_timer); > - > /* > * The response handler may be executed while the request handler > * is still pending. Cancel the request handler. Shall I commit with the added spin_unlocks? Or do you prefer to send an update? -- Stefan Richter -=====-==-=- =--- =--=- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock 2010-08-18 12:50 ` Stefan Richter @ 2010-08-18 13:05 ` Clemens Ladisch 0 siblings, 0 replies; 17+ messages in thread From: Clemens Ladisch @ 2010-08-18 13:05 UTC (permalink / raw) To: Stefan Richter Cc: Peter Zijlstra, Yong Zhang, Johannes Berg, Thomas Gleixner, linux1394-devel, linux-kernel Stefan Richter wrote: > Clemens Ladisch wrote: > > spin_lock_irqsave(&card->lock, flags); > > list_for_each_entry(t, &card->transaction_list, link) { > > if (t == transaction) { > > + if (!del_timer(&t->split_timeout_timer)) > > + goto timed_out; > > + if (!del_timer(&t->split_timeout_timer)) { > + spin_unlock_irqrestore(&card->lock, flags); > + goto timed_out; > + } > Shall I commit with the added spin_unlocks? I can obviously not be trusted to do this correctly; please go ahead. Regards, Clemens ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2010-08-18 13:18 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-08-16 18:42 lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock Stefan Richter 2010-08-16 21:42 ` Peter Zijlstra 2010-08-16 22:09 ` Johannes Berg 2010-08-16 22:27 ` Johannes Berg 2010-08-17 16:22 ` Stefan Richter 2010-08-18 13:17 ` Stefan Richter 2010-08-17 14:35 ` Yong Zhang 2010-08-17 16:23 ` Stefan Richter 2010-08-18 7:01 ` Clemens Ladisch 2010-08-18 8:09 ` Stefan Richter 2010-08-18 8:53 ` Clemens Ladisch 2010-08-18 9:08 ` Yong Zhang 2010-08-18 8:59 ` Peter Zijlstra 2010-08-18 9:26 ` Clemens Ladisch 2010-08-18 9:43 ` Peter Zijlstra 2010-08-18 12:50 ` Stefan Richter 2010-08-18 13:05 ` Clemens Ladisch
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox