* spinlock recursion in scsi_end_request() (kernel 2.6.24)
@ 2010-05-17 11:32 Prashant
2010-05-20 10:02 ` Tejun Heo
0 siblings, 1 reply; 5+ messages in thread
From: Prashant @ 2010-05-17 11:32 UTC (permalink / raw)
To: linux-ide
Hi,
I have a board with a backplane for SATA disks. Sometimes, when I unplug
a disk while I/O is in progress, I get the following problem. Has anyone
experienced this before?
Sometimes the spinlock owner is kblockd, sometimes it is scsi_eh.
sd 7:0:0:0: rejecting I/O to offline device
sd 7:0:0:0: rejecting I/O to offline device
sd 7:0:0:0: [sda] Result: hostbyte=0x01 driverbyte=0x00
BUG: spinlock recursion on CPU#0, kblockd/0/54
lock: cf53a138, .magic: dead4ead, .owner: kblockd/0/54, .owner_cpu: 0
[<c033af90>] (dump_stack+0x0/0x14) from [<c013f710>] (spin_bug+0x90/0xa4)
[<c013f680>] (spin_bug+0x0/0xa4) from [<c013f870>] (_raw_spin_lock+0x50/0x15c)
r5:20000013 r4:cf53a138
[<c013f820>] (_raw_spin_lock+0x0/0x15c) from [<c029e390>] (_spin_lock_irqsave+0x2c/0x34)
[<c029e364>] (_spin_lock_irqsave+0x0/0x34) from [<c01c0794>] (scsi_end_request+0x94/0xdc)
r5:cfffe3e0 r4:cf539e84
[<c01c0700>] (scsi_end_request+0x0/0xdc) from [<c01c0d10>] (scsi_io_completion+0x314/0x330)
r8:00000000 r7:cf539e84 r6:00010000 r5:cfffe3e0 r4:00000000
[<c01c09fc>] (scsi_io_completion+0x0/0x330) from [<c01bb744>] (scsi_finish_command+0x84/0x88)
[<c01bb6c0>] (scsi_finish_command+0x0/0x88) from [<c01c1394>] (scsi_softirq_done+0xbc/0x114)
r6:cfffe3e0 r5:00000005 r4:00000bb8
[<c01c12d8>] (scsi_softirq_done+0x0/0x114) from [<c01282a0>] (blk_done_softirq+0x7c/0x9c)
r7:c0373760 r6:0000000a r5:00000001 r4:cfcafcd0
[<c0128224>] (blk_done_softirq+0x0/0x9c) from [<c0053f58>] (__do_softirq+0x64/0xd0)
r4:c03737bc
[<c0053ef4>] (__do_softirq+0x0/0xd0) from [<c005406c>] (do_softirq+0x4c/0x68)
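For reading the report above: a minimal user-space sketch, assuming the
"spinlock recursion" message comes from a debug check that compares the
lock's recorded owner with the task trying to take it. All toy_* names
are hypothetical and exist only for this illustration; the magic value
is copied from the .magic field in the report. This is a model of the
check, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same value as the .magic field printed in the report above. */
#define TOY_SPINLOCK_MAGIC 0xdead4eadu

/* Hypothetical stand-in for a task: name and pid, e.g. kblockd/0 and 54. */
struct toy_task {
    const char *comm;
    int pid;
};

/* Hypothetical debug spinlock: records who holds it. */
struct toy_spinlock {
    unsigned int magic;
    const struct toy_task *owner;   /* NULL when unlocked */
    int owner_cpu;                  /* -1 when unlocked */
};

enum toy_lock_result { TOY_LOCK_OK, TOY_LOCK_BAD_MAGIC, TOY_LOCK_RECURSION };

static void toy_spin_lock_init(struct toy_spinlock *lock)
{
    lock->magic = TOY_SPINLOCK_MAGIC;
    lock->owner = NULL;
    lock->owner_cpu = -1;
}

/* Take the lock, reporting recursion if the caller already owns it. */
static enum toy_lock_result toy_spin_lock(struct toy_spinlock *lock,
                                          const struct toy_task *current,
                                          int cpu)
{
    if (lock->magic != TOY_SPINLOCK_MAGIC)
        return TOY_LOCK_BAD_MAGIC;

    if (lock->owner == current) {
        fprintf(stderr, "BUG: spinlock recursion on CPU#%d, %s/%d\n",
                cpu, current->comm, current->pid);
        return TOY_LOCK_RECURSION;
    }

    /* A real spinlock would spin until the lock is free. This model is
     * single-threaded, so the lock must already be free at this point. */
    assert(lock->owner == NULL);
    lock->owner = current;
    lock->owner_cpu = cpu;
    return TOY_LOCK_OK;
}

static void toy_spin_unlock(struct toy_spinlock *lock)
{
    lock->owner = NULL;
    lock->owner_cpu = -1;
}
```

In this model, a second toy_spin_lock() by the same task without an
intervening toy_spin_unlock() returns TOY_LOCK_RECURSION, which is the
situation the report describes: owner and caller are both kblockd/0/54.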
Thanks,
prashant
* Re: spinlock recursion in scsi_end_request() (kernel 2.6.24)
2010-05-17 11:32 spinlock recursion in scsi_end_request() (kernel 2.6.24) Prashant
@ 2010-05-20 10:02 ` Tejun Heo
2010-05-20 11:33 ` Prashant
0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2010-05-20 10:02 UTC (permalink / raw)
To: Prashant; +Cc: linux-ide
On 05/17/2010 01:32 PM, Prashant wrote:
> Hi,
> I have a board with a backplane for SATA disks. Sometimes, when I unplug
> a disk while I/O is in progress, I get the following problem. Has anyone
> experienced this before?
> Sometimes the spinlock owner is kblockd, sometimes it is scsi_eh.
Sorry but 2.6.24 is way too ancient at this point and too much has
changed. Can you please try a recent kernel and see whether the
problem is reproducible?
Thanks.
--
tejun
* Re: spinlock recursion in scsi_end_request() (kernel 2.6.24)
2010-05-20 10:02 ` Tejun Heo
@ 2010-05-20 11:33 ` Prashant
2010-05-20 15:05 ` Tejun Heo
0 siblings, 1 reply; 5+ messages in thread
From: Prashant @ 2010-05-20 11:33 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide
Hi,
On Thu, May 20, 2010 at 3:32 PM, Tejun Heo <tj@kernel.org> wrote:
> On 05/17/2010 01:32 PM, Prashant wrote:
>> Hi,
>> I have a board with a backplane for SATA disks. Sometimes, when I unplug
>> a disk while I/O is in progress, I get the following problem. Has anyone
>> experienced this before?
>> Sometimes the spinlock owner is kblockd, sometimes it is scsi_eh.
>
> Sorry but 2.6.24 is way too ancient at this point and too much has
> changed. Can you please try a recent kernel and see whether the
> problem is reproducible?
>
Okay, I'll post an update if I get the same problem with the latest
kernel (it will take a lot of time).
I have a question about code that is almost the same in the current kernel.
I don't know whether this is the right mailing list for the following question.
When a SATA drive is unplugged, its corresponding sdev's state is set
to SDEV_OFFLINE. If I/O requests are still coming in for the same device,
they are killed by calling scsi_kill_request().
1) scsi_kill_request() does the following:
i) Unlocks the request queue
ii) Increments the host_busy count
iii) Locks the request queue
iv) Calls __scsi_done()
2) __scsi_done() does the following:
i) Sets the request completion data
ii) Calls blk_complete_request()
3) blk_complete_request() does the following:
i) Adds request->donelist to the blk_cpu_done softirq queue
and raises the softirq (which ends up in scsi_softirq_done)
4) The next sequence is:
scsi_softirq_done >> scsi_finish_command >> scsi_device_unbusy()
5) scsi_device_unbusy() locks the request queue again. This is where
we can get into the spinlock recursion.
Is this correct? Please correct me if something is wrong.
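The locking in steps 1 and 5 can be sketched in user space as follows.
This is a single-threaded model under stated assumptions, not the kernel
code: every toy_* name is hypothetical, the separate host lock and the
device_busy counter are assumptions that the steps above do not mention,
and the call to __scsi_done() is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock: acquiring it twice trips an assert. */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

struct toy_host {
    struct toy_lock host_lock;   /* assumed: protects host_busy */
    int host_busy;
};

struct toy_device {
    struct toy_lock queue_lock;  /* the request queue lock */
    int device_busy;             /* assumed per-device counter */
    struct toy_host *host;
};

/* Steps 1.i to 1.iii: entered with queue_lock held, returns with it held. */
static void toy_kill_request(struct toy_device *sdev)
{
    assert(sdev->queue_lock.held);
    sdev->device_busy++;

    toy_lock_release(&sdev->queue_lock);        /* 1.i   */
    toy_lock_acquire(&sdev->host->host_lock);
    sdev->host->host_busy++;                    /* 1.ii  */
    toy_lock_release(&sdev->host->host_lock);
    toy_lock_acquire(&sdev->queue_lock);        /* 1.iii */

    /* 1.iv: __scsi_done() would be called here; not modelled. */
}

/* Step 5: the completion side undoes the counts and takes queue_lock. */
static void toy_device_unbusy(struct toy_device *sdev)
{
    toy_lock_acquire(&sdev->host->host_lock);
    sdev->host->host_busy--;
    toy_lock_release(&sdev->host->host_lock);

    toy_lock_acquire(&sdev->queue_lock);
    sdev->device_busy--;
    toy_lock_release(&sdev->queue_lock);
}
```

In this model, toy_device_unbusy() would trip the assert in
toy_lock_acquire() only if it ran while the caller of toy_kill_request()
still held queue_lock, which is the scenario the question asks about.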
Thanks,
Prashant
* Re: spinlock recursion in scsi_end_request() (kernel 2.6.24)
2010-05-20 11:33 ` Prashant
@ 2010-05-20 15:05 ` Tejun Heo
2010-05-20 15:29 ` James Bottomley
0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2010-05-20 15:05 UTC (permalink / raw)
To: Prashant; +Cc: linux-ide, linux-scsi
Hello,
On 05/20/2010 01:33 PM, Prashant wrote:
> I have a question related to code which is almost same in the
> current kernel. I don't know whether this is the right mailing list
> for the following question.
linux-scsi would probably fit better (cc'd).
> When a SATA drive is unplugged, its corresponding sdev's state is set
> to SDEV_OFFLINE. If I/O requests are still coming in for the same device,
> they are killed by calling scsi_kill_request().
>
> 1) scsi_kill_request() does the following:
> i) Unlocks the request queue
> ii) Increments the host_busy count
> iii) Locks the request queue
> iv) Calls __scsi_done()
>
> 2) __scsi_done() does the following:
> i) Sets the request completion data
> ii) Calls blk_complete_request()
>
> 3) blk_complete_request() does the following:
> i) Adds request->donelist to the blk_cpu_done softirq queue
> and raises the softirq (which ends up in scsi_softirq_done)
>
> 4) The next sequence is:
> scsi_softirq_done >> scsi_finish_command >> scsi_device_unbusy()
>
> 5) scsi_device_unbusy() locks the request queue again. This is where
> we can get into the spinlock recursion.
>
> Is this correct? Please correct me if something is wrong.
Raising a softirq defers the work to another context, so grabbing the
same lock from the softirq handler doesn't constitute recursive locking.
Please try to reproduce the problem on a recent kernel with lockdep
enabled.
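A minimal user-space sketch of that point, assuming a single-threaded
model in which "raising the softirq" only queues the request and the
handler runs later, after the caller has dropped its lock. All toy_*
names are hypothetical; interrupts and multiple CPUs are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical non-recursive lock that counts re-acquire attempts. */
struct toy_lock {
    bool held;
    int recursion_hits;
};

static bool toy_lock_acquire(struct toy_lock *l)
{
    if (l->held) {              /* single-threaded model: this is recursion */
        l->recursion_hits++;
        return false;
    }
    l->held = true;
    return true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

struct toy_request {
    struct toy_request *next;
    bool done;
};

struct toy_queue {
    struct toy_lock lock;
    struct toy_request *done_list;  /* requests waiting for the handler */
    int completed;
};

/* Models raising the softirq: only queues the request and takes no
 * lock, so the caller may or may not hold q->lock. */
static void toy_complete_request(struct toy_queue *q, struct toy_request *rq)
{
    rq->next = q->done_list;
    q->done_list = rq;
}

/* Models the softirq handler: runs later, in its own context, and
 * takes q->lock for each queued request. */
static void toy_run_softirq(struct toy_queue *q)
{
    struct toy_request *rq = q->done_list;

    q->done_list = NULL;
    while (rq != NULL) {
        struct toy_request *next = rq->next;

        if (toy_lock_acquire(&q->lock)) {
            rq->done = true;
            q->completed++;
            toy_lock_release(&q->lock);
        }
        rq = next;
    }
}
```

Calling toy_complete_request() once with q->lock held and once without,
then releasing the lock and calling toy_run_softirq(), completes both
requests and leaves recursion_hits at zero in this model.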
Thanks.
--
tejun
* Re: spinlock recursion in scsi_end_request() (kernel 2.6.24)
2010-05-20 15:05 ` Tejun Heo
@ 2010-05-20 15:29 ` James Bottomley
0 siblings, 0 replies; 5+ messages in thread
From: James Bottomley @ 2010-05-20 15:29 UTC (permalink / raw)
To: Tejun Heo; +Cc: Prashant, linux-ide, linux-scsi
On Thu, 2010-05-20 at 17:05 +0200, Tejun Heo wrote:
> Hello,
>
> On 05/20/2010 01:33 PM, Prashant wrote:
> > I have a question related to code which is almost same in the
> > current kernel. I don't know whether this is the right mailing list
> > for the following question.
>
> linux-scsi would probably fit better (cc'd).
>
> > When a SATA drive is unplugged, its corresponding sdev's state is set
> > to SDEV_OFFLINE. If I/O requests are still coming in for the same device,
> > they are killed by calling scsi_kill_request().
> >
> > 1) scsi_kill_request() does the following:
> > i) Unlocks the request queue
> > ii) Increments the host_busy count
> > iii) Locks the request queue
> > iv) Calls __scsi_done()
> >
> > 2) __scsi_done() does the following:
> > i) Sets the request completion data
> > ii) Calls blk_complete_request()
> >
> > 3) blk_complete_request() does the following:
> > i) Adds request->donelist to the blk_cpu_done softirq queue
> > and raises the softirq (which ends up in scsi_softirq_done)
> >
> > 4) The next sequence is:
> > scsi_softirq_done >> scsi_finish_command >> scsi_device_unbusy()
> >
> > 5) scsi_device_unbusy() locks the request queue again. This is where
> > we can get into the spinlock recursion.
> >
> > Is this correct? Please correct me if something is wrong.
>
> Raising a softirq defers the work to another context, so grabbing the
> same lock from the softirq handler doesn't constitute recursive locking.
> Please try to reproduce the problem on a recent kernel with lockdep
> enabled.
Just to confirm what Tejun says: the cmd -> done path (i.e. scsi_done)
goes through the block softirq handler specifically so that it can be
called either locked or unlocked, so this can never be a recursion.
James