public inbox for linux-scsi@vger.kernel.org
* RE: [Emulex] queuecommands and GFP_ATOMIC
@ 2004-06-08 18:21 Smart, James
  2004-06-08 18:23 ` 'Christoph Hellwig'
  0 siblings, 1 reply; 4+ messages in thread
From: Smart, James @ 2004-06-08 18:21 UTC (permalink / raw)
  To: 'Christoph Hellwig'
  Cc: Anton Blanchard, 'linux-scsi@vger.kernel.org'

So we've done some testing with the locks moved and the GFP_xxx flag
changed...

On a RedHat 3.90 2.6.5 kernel, we hit the following:

Debug: sleeping function called from invalid context at mm/slab.c:1980
in_atomic():1, irqs_disabled():0
Call Trace:
 [<0211c5a1>] __might_sleep+0x80/0x8a
 [<0213ba96>] kmem_cache_alloc+0x1b/0x50
 [<82ac32e8>] lpfc_get_scsi_buf+0x20/0x12e [lpfc]
 [<82ac45b5>] lpfc_queuecommand+0x2c/0x175 [lpfc]
 [<82834864>] scsi_done+0x0/0x60 [scsi_mod]
 [<828346a7>] scsi_dispatch_cmd+0x1bb/0x220 [scsi_mod]
 [<828390d1>] scsi_request_fn+0x292/0x31e [scsi_mod]
 [<021f3b5c>] blk_run_queue+0x25/0x34
 [<82838679>] scsi_end_request+0x9d/0xa4 [scsi_mod]
 [<828389d6>] scsi_io_completion+0x225/0x427 [scsi_mod]
 [<82834a59>] scsi_finish_command+0xac/0xb1 [scsi_mod]
 [<8283497a>] scsi_softirq+0xb6/0xc3 [scsi_mod]
 [<02121d00>] __do_softirq+0x48/0x9d
 [<02107f91>] do_softirq+0x4f/0x56

Given this stack trace, any GFP_xxx flag that has __GFP_WAIT set is at risk
(includes GFP_NOIO).
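
To make the flag relationship concrete, here is a small userspace model. It is
not kernel code. The MODEL_ names are made up for this sketch and the bit values
are only illustrative. What it tries to mirror is how the masks are composed in
the 2.6-era include/linux/gfp.h: GFP_ATOMIC carries no __GFP_WAIT, while
GFP_NOIO, GFP_NOFS and GFP_KERNEL all do.

```c
#include <stdbool.h>

/* Illustrative bit values; the authoritative ones are in include/linux/gfp.h. */
#define MODEL_GFP_WAIT 0x10u /* allocator is allowed to sleep */
#define MODEL_GFP_HIGH 0x20u /* may dip into emergency reserves */
#define MODEL_GFP_IO   0x40u /* may start block I/O */
#define MODEL_GFP_FS   0x80u /* may call into the filesystem */

/* Composite masks, composed the same way as their kernel namesakes. */
#define MODEL_GFP_ATOMIC (MODEL_GFP_HIGH)
#define MODEL_GFP_NOIO   (MODEL_GFP_WAIT)
#define MODEL_GFP_NOFS   (MODEL_GFP_WAIT | MODEL_GFP_IO)
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)

/* True when an allocation with these flags is permitted to sleep. */
static bool model_may_sleep(unsigned int flags)
{
    return (flags & MODEL_GFP_WAIT) != 0;
}

/* A sleeping allocation is only legal outside atomic context. */
static bool model_alloc_allowed(unsigned int flags, bool in_atomic)
{
    return !(in_atomic && model_may_sleep(flags));
}
```

With in_atomic set, as in the trace above, only the ATOMIC mask passes the
check in this model; NOIO, NOFS and KERNEL all fail it.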

So, the only real solution looks to be to go back to the local memory pools
in queuecommand, failing back to the midlayer upon alloc failure.
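
A rough userspace sketch of that fallback, assuming a fixed-size preallocated
pool. All names here (model_pool, model_queuecommand, MODEL_HOST_BUSY) are
hypothetical and are not lpfc code. In a real driver the busy return would
presumably be something like SCSI_MLQUEUE_HOST_BUSY, so that the midlayer
requeues the command and retries later.

```c
#include <stddef.h>

#define MODEL_POOL_SIZE 4 /* hypothetical pool depth */
#define MODEL_QUEUED    0 /* command accepted */
#define MODEL_HOST_BUSY 1 /* stand-in for "hand it back to the midlayer" */

struct model_buf {
    int in_use;
};

struct model_pool {
    struct model_buf bufs[MODEL_POOL_SIZE];
};

/* Take a preallocated buffer; never sleeps, returns NULL when exhausted. */
static struct model_buf *model_pool_get(struct model_pool *pool)
{
    for (size_t i = 0; i < MODEL_POOL_SIZE; i++) {
        if (!pool->bufs[i].in_use) {
            pool->bufs[i].in_use = 1;
            return &pool->bufs[i];
        }
    }
    return NULL;
}

/* Return a buffer to the pool, e.g. on command completion. */
static void model_pool_put(struct model_buf *buf)
{
    buf->in_use = 0;
}

/* On allocation failure, report busy instead of waiting for memory. */
static int model_queuecommand(struct model_pool *pool, struct model_buf **out)
{
    struct model_buf *buf = model_pool_get(pool);

    if (buf == NULL)
        return MODEL_HOST_BUSY;
    *out = buf;
    return MODEL_QUEUED;
}
```

The point of the pattern is that nothing on the queuecommand path can sleep:
the only outcomes are "got a buffer" or "try again later".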

Any other suggestions?

-- James


> -----Original Message-----
> From: 'Christoph Hellwig' [mailto:hch@infradead.org]
> Sent: Tuesday, June 08, 2004 1:03 PM
> To: Smart, James
> Cc: Anton Blanchard; 'linux-scsi@vger.kernel.org'
> Subject: Re: Emulex lpfcdriver v8.0.2 available
> 
> 
> On Tue, Jun 08, 2004 at 10:54:27AM -0400, Smart, James wrote:
> > FYI - and status update.
> > 
> > A malloc or two was condensed on the fast path. However, the reorg so that
> > alloc is not under lock is not done yet. Should be in the next drop. In
> > queuecommand, we will: unlock the host lock, perform all allocations and do
> > prep data, then take the driver lock, complete state checks and submit to
> > hardware. Should allow all the ATOMIC allocations to go back to normal for
> > the queuecommand sequence.
> 
> Not quite normal, it still needs to be GFP_NOIO to avoid deadlocks.
> 


* Re: [Emulex] queuecommands and GFP_ATOMIC
  2004-06-08 18:21 Smart, James
@ 2004-06-08 18:23 ` 'Christoph Hellwig'
  0 siblings, 0 replies; 4+ messages in thread
From: 'Christoph Hellwig' @ 2004-06-08 18:23 UTC (permalink / raw)
  To: Smart, James
  Cc: 'Christoph Hellwig', Anton Blanchard,
	'linux-scsi@vger.kernel.org'

On Tue, Jun 08, 2004 at 02:21:05PM -0400, Smart, James wrote:
> So we've done some testing with the locks moved and the GFP_xxx flag
> changed...
> 
> On a RedHat 3.90 2.6.5 kernel, we hit the following:
> 
> Debug: sleeping function called from invalid context at mm/slab.c:1980
> in_atomic():1, irqs_disabled():0
> Call Trace:

Looks like you used spin_unlock instead of spin_unlock_irq for the host
lock.


* RE: [Emulex] queuecommands and GFP_ATOMIC
@ 2004-06-08 19:06 Smart, James
  2004-06-08 19:09 ` 'Christoph Hellwig'
  0 siblings, 1 reply; 4+ messages in thread
From: Smart, James @ 2004-06-08 19:06 UTC (permalink / raw)
  To: 'Christoph Hellwig'
  Cc: Anton Blanchard, 'linux-scsi@vger.kernel.org'

Actually, we are using spin_unlock_irq().  We're looking further...  I take
it from your response, this calling sequence should not have been an issue ?

-- james


> -----Original Message-----
> From: 'Christoph Hellwig' [mailto:hch@infradead.org]
> Sent: Tuesday, June 08, 2004 2:24 PM
> To: Smart, James
> Cc: 'Christoph Hellwig'; Anton Blanchard; 'linux-scsi@vger.kernel.org'
> Subject: Re: [Emulex] queuecommands and GFP_ATOMIC
> 
> 
> On Tue, Jun 08, 2004 at 02:21:05PM -0400, Smart, James wrote:
> > So we've done some testing with the locks moved and the GFP_xxx flag
> > changed...
> > 
> > On a RedHat 3.90 2.6.5 kernel, we hit the following:
> > 
> > Debug: sleeping function called from invalid context at mm/slab.c:1980
> > in_atomic():1, irqs_disabled():0
> > Call Trace:
> 
> Looks like you used spin_unlock instead of spin_unlock_irq for the host
> lock.
> 


* Re: [Emulex] queuecommands and GFP_ATOMIC
  2004-06-08 19:06 [Emulex] queuecommands and GFP_ATOMIC Smart, James
@ 2004-06-08 19:09 ` 'Christoph Hellwig'
  0 siblings, 0 replies; 4+ messages in thread
From: 'Christoph Hellwig' @ 2004-06-08 19:09 UTC (permalink / raw)
  To: Smart, James
  Cc: 'Christoph Hellwig', Anton Blanchard,
	'linux-scsi@vger.kernel.org'

On Tue, Jun 08, 2004 at 03:06:58PM -0400, Smart, James wrote:
> Actually, we are using spin_unlock_irq().  We're looking further...  I take
> it from your response, this calling sequence should not have been an issue ?

Umm, sorry.  I misread your mail, it clearly states:

> > > Debug: sleeping function called from invalid context at mm/slab.c:1980
> > > in_atomic():1, irqs_disabled():0
> > > Call Trace:

so you do have irqs enabled but are still in some critical section.

Can you post a snapshot of the codebase you're testing with?
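
One possible reading of the trace, offered as a guess until the code is posted:
the call chain starts in __do_softirq, and softirq context on its own makes
in_atomic() true even with interrupts enabled and no spinlock held. The
userspace model below illustrates that. The model_ names and the offset value
are assumptions for the sketch, and in_atomic() is simplified to a plain
"preempt count is nonzero" test.

```c
#include <stdbool.h>

/* Illustrative: the softirq count sits in a field above the preempt count. */
#define MODEL_SOFTIRQ_OFFSET 0x100u

struct model_cpu {
    unsigned int preempt_count;
    bool irqs_disabled;
};

static void model_softirq_enter(struct model_cpu *cpu)
{
    cpu->preempt_count += MODEL_SOFTIRQ_OFFSET;
}

static void model_softirq_exit(struct model_cpu *cpu)
{
    cpu->preempt_count -= MODEL_SOFTIRQ_OFFSET;
}

/* Models spin_lock_irq() on a preemptible kernel: irqs off, count bumped. */
static void model_spin_lock_irq(struct model_cpu *cpu)
{
    cpu->irqs_disabled = true;
    cpu->preempt_count += 1;
}

/* Models spin_unlock_irq(): count dropped, irqs back on. */
static void model_spin_unlock_irq(struct model_cpu *cpu)
{
    cpu->preempt_count -= 1;
    cpu->irqs_disabled = false;
}

/* Simplified in_atomic(): any nonzero count means "must not sleep". */
static bool model_in_atomic(const struct model_cpu *cpu)
{
    return cpu->preempt_count != 0;
}
```

In this model, dropping the host lock with the _irq variant re-enables
interrupts, yet the softirq count is still held, which matches the reported
in_atomic():1, irqs_disabled():0.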


