linux-ide.vger.kernel.org archive mirror
* spinlock recursion in scsi_end_request() (kernel 2.6.24)
@ 2010-05-17 11:32 Prashant
  2010-05-20 10:02 ` Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread
From: Prashant @ 2010-05-17 11:32 UTC
  To: linux-ide

Hi,
I have a board with a backplane for SATA disks. Sometimes, when I unplug
a disk while I/O is in progress, I get the following problem. Has anybody
experienced this before?
Sometimes the spinlock owner is kblockd, sometimes it is scsi_eh.

sd 7:0:0:0: rejecting I/O to offline device
sd 7:0:0:0: rejecting I/O to offline device
sd 7:0:0:0: [sda] Result: hostbyte=0x01 driverbyte=0x00
BUG: spinlock recursion on CPU#0, kblockd/0/54
 lock: cf53a138, .magic: dead4ead, .owner: kblockd/0/54, .owner_cpu: 0
[<c033af90>] (dump_stack+0x0/0x14) from [<c013f710>] (spin_bug+0x90/0xa4)
[<c013f680>] (spin_bug+0x0/0xa4) from [<c013f870>] (_raw_spin_lock+0x50/0x15c)
 r5:20000013 r4:cf53a138
[<c013f820>] (_raw_spin_lock+0x0/0x15c) from [<c029e390>] (_spin_lock_irqsave+0x2c/0x34)
[<c029e364>] (_spin_lock_irqsave+0x0/0x34) from [<c01c0794>] (scsi_end_request+0x94/0xdc)
 r5:cfffe3e0 r4:cf539e84
[<c01c0700>] (scsi_end_request+0x0/0xdc) from [<c01c0d10>] (scsi_io_completion+0x314/0x330)
 r8:00000000 r7:cf539e84 r6:00010000 r5:cfffe3e0 r4:00000000
[<c01c09fc>] (scsi_io_completion+0x0/0x330) from [<c01bb744>] (scsi_finish_command+0x84/0x88)
[<c01bb6c0>] (scsi_finish_command+0x0/0x88) from [<c01c1394>] (scsi_softirq_done+0xbc/0x114)
 r6:cfffe3e0 r5:00000005 r4:00000bb8
[<c01c12d8>] (scsi_softirq_done+0x0/0x114) from [<c01282a0>] (blk_done_softirq+0x7c/0x9c)
 r7:c0373760 r6:0000000a r5:00000001 r4:cfcafcd0
[<c0128224>] (blk_done_softirq+0x0/0x9c) from [<c0053f58>] (__do_softirq+0x64/0xd0)
 r4:c03737bc
[<c0053ef4>] (__do_softirq+0x0/0xd0) from [<c005406c>] (do_softirq+0x4c/0x68)
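
For reference, the "BUG: spinlock recursion" line above matches the kind of
owner check that spinlock debugging performs: the lock records who holds it,
and an attempt to take it again from the same owner is reported rather than
left to spin forever. Below is a minimal userspace sketch of that idea, not
the kernel's code. struct dbg_lock, dbg_lock_acquire() and dbg_lock_release()
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of an owner-tracking lock.
 * This is an illustration only, not the kernel's spinlock_t. */
struct dbg_lock {
	bool locked;
	const void *owner;	/* context holding the lock, NULL when free */
};

enum lock_result { LOCK_OK, LOCK_RECURSION, LOCK_CONTENDED };

/* Try to take the lock on behalf of context ctx. */
static enum lock_result dbg_lock_acquire(struct dbg_lock *l, const void *ctx)
{
	if (l->locked && l->owner == ctx)
		return LOCK_RECURSION;	/* same owner again: the reported case */
	if (l->locked)
		return LOCK_CONTENDED;	/* a real spinlock would spin here */
	l->locked = true;
	l->owner = ctx;
	return LOCK_OK;
}

/* Drop the lock and forget the owner. */
static void dbg_lock_release(struct dbg_lock *l)
{
	l->locked = false;
	l->owner = NULL;
}
```

In this model, a second acquire by the owner that already holds the lock
returns LOCK_RECURSION, which corresponds to the .owner field in the report
naming the same task that hit the BUG.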


Thanks,
prashant


* Re: spinlock recursion in scsi_end_request() (kernel 2.6.24)
  2010-05-17 11:32 spinlock recursion in scsi_end_request() (kernel 2.6.24) Prashant
@ 2010-05-20 10:02 ` Tejun Heo
  2010-05-20 11:33   ` Prashant
  0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2010-05-20 10:02 UTC
  To: Prashant; +Cc: linux-ide

On 05/17/2010 01:32 PM, Prashant wrote:
> Hi,
> I have a board with a backplane for SATA disks. Sometimes, when I unplug
> a disk while I/O is in progress, I get the following problem. Has anybody
> experienced this before?
> Sometimes the spinlock owner is kblockd, sometimes it is scsi_eh.

Sorry but 2.6.24 is way too ancient at this point and too much has
changed.  Can you please try a recent kernel and see whether the
problem is reproducible?

Thanks.

-- 
tejun


* Re: spinlock recursion in scsi_end_request() (kernel 2.6.24)
  2010-05-20 10:02 ` Tejun Heo
@ 2010-05-20 11:33   ` Prashant
  2010-05-20 15:05     ` Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread
From: Prashant @ 2010-05-20 11:33 UTC
  To: Tejun Heo; +Cc: linux-ide

Hi,

On Thu, May 20, 2010 at 3:32 PM, Tejun Heo <tj@kernel.org> wrote:
> On 05/17/2010 01:32 PM, Prashant wrote:
>> Hi,
>> I have a board with a backplane for SATA disks. Sometimes, when I unplug
>> a disk while I/O is in progress, I get the following problem. Has anybody
>> experienced this before?
>> Sometimes the spinlock owner is kblockd, sometimes it is scsi_eh.
>
> Sorry but 2.6.24 is way too ancient at this point and too much has
> changed.  Can you please try a recent kernel and see whether the
> problem is reproducible?
>
Okay, I'll report back if I get the same problem with the latest kernel
(it will take a lot of time).

I have a question related to code which is almost the same in the current kernel.
I don't know whether this is the right mailing list for the following question.

When a SATA drive is unplugged, its corresponding sdev's state is set
to SDEV_OFFLINE. Now if I/O requests are still coming in for the same device,
they are killed by calling scsi_kill_request().

1) scsi_kill_request() does the following:
    i) Unlocks the request queue
    ii) Increments the host_busy count
    iii) Locks the request queue
    iv) Calls __scsi_done()

2) __scsi_done() does the following:
     i) Sets the request completion data
     ii) Calls blk_complete_request()

3) blk_complete_request() does the following:
     i) Adds request->donelist to the blk_cpu_done softirq queue
        and raises the softirq (which calls scsi_softirq_done)

4) The next sequence is:
    scsi_softirq_done >> scsi_finish_command >> scsi_device_unbusy()

5) scsi_device_unbusy() locks the request queue again. This is the place where
    we can get into the spinlock recursion.

    Is this correct? Please correct me if something is wrong.

Thanks,
Prashant


* Re: spinlock recursion in scsi_end_request() (kernel 2.6.24)
  2010-05-20 11:33   ` Prashant
@ 2010-05-20 15:05     ` Tejun Heo
  2010-05-20 15:29       ` James Bottomley
  0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2010-05-20 15:05 UTC
  To: Prashant; +Cc: linux-ide, linux-scsi

Hello,

On 05/20/2010 01:33 PM, Prashant wrote:
> I have a question related to code which is almost the same in the
> current kernel.  I don't know whether this is the right mailing list
> for the following question.

linux-scsi would probably fit better (cc'd).

> When a SATA drive is unplugged, its corresponding sdev's state is set
> to SDEV_OFFLINE. Now if I/O requests are still coming in for the same device,
> they are killed by calling scsi_kill_request().
> 
> 1) scsi_kill_request() does the following:
>     i) Unlocks the request queue
>     ii) Increments the host_busy count
>     iii) Locks the request queue
>     iv) Calls __scsi_done()
> 
> 2) __scsi_done() does the following:
>      i) Sets the request completion data
>      ii) Calls blk_complete_request()
> 
> 3) blk_complete_request() does the following:
>      i) Adds request->donelist to the blk_cpu_done softirq queue
>         and raises the softirq (which calls scsi_softirq_done)
> 
> 4) The next sequence is:
>     scsi_softirq_done >> scsi_finish_command >> scsi_device_unbusy()
> 
> 5) scsi_device_unbusy() locks the request queue again. This is the place where
>     we can get into the spinlock recursion.
> 
>     Is this correct? Please correct me if something is wrong.

Raising a softirq defers the work to another context, and grabbing the
same lock from the softirq handler doesn't constitute recursive locking.
Please try to reproduce the problem on a recent kernel with lockdep
enabled.
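
As a rough illustration of that point, here is a small userspace model, not
kernel code. It assumes a single queue lock and a pending array standing in
for the softirq queue; struct queue_model and the model_*() helpers are
hypothetical names. The deferred path only records the request while the
lock is held and takes the lock again after it has been dropped; the inline
variant is a hypothetical contrast in which the handler runs with the lock
still held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical model: one lock, a list of deferred completions. */
struct queue_model {
	bool lock_held;
	int recursion_hits;	/* lock taken while already held */
	int completed;
	int pending[MAX_PENDING];
	size_t npending;
};

static void model_lock(struct queue_model *q)
{
	if (q->lock_held)
		q->recursion_hits++;
	q->lock_held = true;
}

static void model_unlock(struct queue_model *q)
{
	q->lock_held = false;
}

/* May be called with the lock held: only records the request. */
static void model_raise_done(struct queue_model *q, int req)
{
	if (q->npending < MAX_PENDING)
		q->pending[q->npending++] = req;
}

/* Deferred handler: takes the lock once per pending request. */
static void model_run_deferred(struct queue_model *q)
{
	for (size_t i = 0; i < q->npending; i++) {
		model_lock(q);
		q->completed++;
		model_unlock(q);
	}
	q->npending = 0;
}

/* Deferred path: the handler runs only after the lock is dropped. */
static int model_kill_request(struct queue_model *q, int req)
{
	model_lock(q);
	model_raise_done(q, req);
	model_unlock(q);
	model_run_deferred(q);
	return q->recursion_hits;
}

/* Hypothetical contrast: handler invoked while the lock is held. */
static int model_complete_inline(struct queue_model *q, int req)
{
	model_lock(q);
	model_raise_done(q, req);
	model_run_deferred(q);
	model_unlock(q);
	return q->recursion_hits;
}
```

In this model only model_complete_inline() ever takes the lock while it is
already held. The model says nothing about what actually triggers the report
on 2.6.24.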

Thanks.

-- 
tejun


* Re: spinlock recursion in scsi_end_request() (kernel 2.6.24)
  2010-05-20 15:05     ` Tejun Heo
@ 2010-05-20 15:29       ` James Bottomley
  0 siblings, 0 replies; 5+ messages in thread
From: James Bottomley @ 2010-05-20 15:29 UTC
  To: Tejun Heo; +Cc: Prashant, linux-ide, linux-scsi

On Thu, 2010-05-20 at 17:05 +0200, Tejun Heo wrote:
> Hello,
> 
> On 05/20/2010 01:33 PM, Prashant wrote:
> > I have a question related to code which is almost the same in the
> > current kernel.  I don't know whether this is the right mailing list
> > for the following question.
> 
> linux-scsi would probably fit better (cc'd).
> 
> > When a SATA drive is unplugged, its corresponding sdev's state is set
> > to SDEV_OFFLINE. Now if I/O requests are still coming in for the same device,
> > they are killed by calling scsi_kill_request().
> > 
> > 1) scsi_kill_request() does the following:
> >     i) Unlocks the request queue
> >     ii) Increments the host_busy count
> >     iii) Locks the request queue
> >     iv) Calls __scsi_done()
> > 
> > 2) __scsi_done() does the following:
> >      i) Sets the request completion data
> >      ii) Calls blk_complete_request()
> > 
> > 3) blk_complete_request() does the following:
> >      i) Adds request->donelist to the blk_cpu_done softirq queue
> >         and raises the softirq (which calls scsi_softirq_done)
> > 
> > 4) The next sequence is:
> >     scsi_softirq_done >> scsi_finish_command >> scsi_device_unbusy()
> > 
> > 5) scsi_device_unbusy() locks the request queue again. This is the place where
> >     we can get into the spinlock recursion.
> > 
> >     Is this correct? Please correct me if something is wrong.
> 
> Raising a softirq defers the work to another context, and grabbing the
> same lock from the softirq handler doesn't constitute recursive locking.
> Please try to reproduce the problem on a recent kernel with lockdep
> enabled.

Just to confirm what Tejun says: the design of the cmd -> done (i.e.
scsi_done) going through the block softirq handler is specifically so it
can be called either locked or unlocked, so this can never be a
recursion.

James


