linux-scsi.vger.kernel.org archive mirror
* SCSI Hardware Handler and slow failover with large number of LUNS
@ 2009-04-04 21:30 Chandra Seetharaman
  2009-04-06 14:44 ` Moger, Babu
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Chandra Seetharaman @ 2009-04-04 21:30 UTC (permalink / raw)
  To: dm-devel, Linux SCSI Mailing list; +Cc: Moger, Babu

Hello All,

During testing with the latest SCSI DH handler on an rdac storage array,
Babu found that failover with 100+ LUNs takes about 15 minutes, which is
not good.

We found that the problem is that dm serializes the activates on the
work queue.

We can solve the problem in the rdac handler in two ways:
 1. Batch up the activates (mode selects) and send fewer of them.
 2. Do the mode selects in async mode.

Just wondering if anybody has seen the same problem with other storage
(EMC, HP, and ALUA).

Please share your experiences, so we can come up with a solution that
works for all hardware handlers.

regards,

chandra


^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: SCSI Hardware Handler and slow failover with large number of LUNS
  2009-04-04 21:30 SCSI Hardware Handler and slow failover with large number of LUNS Chandra Seetharaman
@ 2009-04-06 14:44 ` Moger, Babu
  2009-04-06 15:38 ` berthiaume_wayne
  2009-04-06 15:43 ` SCSI Hardware Handler and slow failover with large number of LUNS Mike Christie
  2 siblings, 0 replies; 9+ messages in thread
From: Moger, Babu @ 2009-04-06 14:44 UTC (permalink / raw)
  To: sekharan@linux.vnet.ibm.com, dm-devel, Linux SCSI Mailing list


Thanks Chandra for bringing this up.

Yes, this will be a major drawback for users who use the rdac handler for their storage with large configurations.

Thanks
Babu Moger 



* RE: SCSI Hardware Handler and slow failover with large number of LUNS
  2009-04-04 21:30 SCSI Hardware Handler and slow failover with large number of LUNS Chandra Seetharaman
  2009-04-06 14:44 ` Moger, Babu
@ 2009-04-06 15:38 ` berthiaume_wayne
  2009-04-06 17:47   ` Chandra Seetharaman
  2009-04-06 15:43 ` SCSI Hardware Handler and slow failover with large number of LUNS Mike Christie
  2 siblings, 1 reply; 9+ messages in thread
From: berthiaume_wayne @ 2009-04-06 15:38 UTC (permalink / raw)
  To: sekharan, dm-devel, linux-scsi; +Cc: Babu.Moger

Hi Chandra.

	We're currently testing 1024 LUNs across four paths with the
older handler but will be doing the same with the new handler later this
month. It sounds like I should move this up in the queue. 

Regards,
Wayne. 




* Re: SCSI Hardware Handler and slow failover with large number of LUNS
  2009-04-04 21:30 SCSI Hardware Handler and slow failover with large number of LUNS Chandra Seetharaman
  2009-04-06 14:44 ` Moger, Babu
  2009-04-06 15:38 ` berthiaume_wayne
@ 2009-04-06 15:43 ` Mike Christie
  2009-04-06 18:21   ` [dm-devel] " Chandra Seetharaman
  2 siblings, 1 reply; 9+ messages in thread
From: Mike Christie @ 2009-04-06 15:43 UTC (permalink / raw)
  To: sekharan, device-mapper development; +Cc: Linux SCSI Mailing list, Moger, Babu

Chandra Seetharaman wrote:
> Hello All,
> 
> During testing with the latest SCSI DH Handler on a rdac storage, Babu
> found that the failover time with 100+ luns takes about 15 minutes,
> which is not good.
> 
> We found that the problem is due to the fact that we serialize activate
> in dm on the work queue.
> 

I thought we talked about this during the review?

> We can solve the problem in rdac handler in 2 ways
>  1. batch up the activates (mode_selects) and send few of them.
>  2. Do mode selects in async mode.

I think most of the ugliness in the original async mode was due to 
trying to use the REQ_BLOCK* path. With the scsi_dh_activate path, it 
should now be easier because in the send path we do not have to worry 
about queue locks being held and context.

I think we could just use blk_execute_rq_nowait to send the IO. Then we 
would have a workqueue/thread per something (maybe per dh module, I 
thought) that would be queued/notified when the IO completes. The 
thread could then process the IO and handle the next stage if needed.

Why use the thread, you might wonder? I think it fixes another issue with 
the original async mode, and it makes things easier if the scsi_dh module 
has to send more IO. When using the thread, the module would not have to 
worry about the queue_lock being held in the IO completion path, and it 
does not have to worry about being run from more restrictive contexts.


> 
> Just wondering if anybody had seen the same problem in other storages
> (EMC, HP and Alua). 

They should all have the same problem.


> 
> Please share your experiences, so we can come up with a solution that
> works for all hardware handlers.
> 
> regards,
> 
> chandra
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel


* RE: SCSI Hardware Handler and slow failover with large number of LUNS
  2009-04-06 15:38 ` berthiaume_wayne
@ 2009-04-06 17:47   ` Chandra Seetharaman
  2009-04-06 17:59     ` SCSI Hardware Handler and slow failover with large number of LUNS berthiaume_wayne
  0 siblings, 1 reply; 9+ messages in thread
From: Chandra Seetharaman @ 2009-04-06 17:47 UTC (permalink / raw)
  To: berthiaume_wayne; +Cc: dm-devel, linux-scsi, Babu.Moger


On Mon, 2009-04-06 at 11:38 -0400, berthiaume_wayne@emc.com wrote:
> Hi Chandra.
> 
> 	We're currently testing 1024 LUNs across four paths with the
> older handler but will be doing the same with the new handler later this

Hi Wayne,

What do you mean by "older" handler? dm_emc?

> month. It sounds like I should move this up in the queue. 
> 
> Regards,
> Wayne. 
> 


* RE: SCSI Hardware Handler and slow failover with large number of LUNS
  2009-04-06 17:47   ` Chandra Seetharaman
@ 2009-04-06 17:59     ` berthiaume_wayne
  0 siblings, 0 replies; 9+ messages in thread
From: berthiaume_wayne @ 2009-04-06 17:59 UTC (permalink / raw)
  To: sekharan; +Cc: dm-devel, linux-scsi, Babu.Moger

Hi Chandra.

	Yes, it is the dm_emc.

Regards,
Wayne. 



* Re: [dm-devel] SCSI Hardware Handler and slow failover with large number of LUNS
  2009-04-06 15:43 ` SCSI Hardware Handler and slow failover with large number of LUNS Mike Christie
@ 2009-04-06 18:21   ` Chandra Seetharaman
  2009-04-06 18:54     ` Mike Christie
  0 siblings, 1 reply; 9+ messages in thread
From: Chandra Seetharaman @ 2009-04-06 18:21 UTC (permalink / raw)
  To: Mike Christie
  Cc: device-mapper development, Linux SCSI Mailing list, Moger, Babu

Thanks for the response Mike.

On Mon, 2009-04-06 at 10:43 -0500, Mike Christie wrote:
> Chandra Seetharaman wrote:
> > Hello All,
> > 
> > During testing with the latest SCSI DH Handler on a rdac storage, Babu
> > found that the failover time with 100+ luns takes about 15 minutes,
> > which is not good.
> > 
> > We found that the problem is due to the fact that we serialize activate
> > in dm on the work queue.
> > 
> 
> I thought we talked about this during the review?

Yes, we did. The results were compared against the unmodified code (w.r.t.
the rdac handler) and they were good (though I used only 49 LUNs):
http://marc.info/?l=dm-devel&m=120889858019762&w=2


> 
> > We can solve the problem in rdac handler in 2 ways
> >  1. batch up the activates (mode_selects) and send few of them.
> >  2. Do mode selects in async mode.
> 
> I think most of the ugliness in the original async mode was due to 
> trying to use the REQ_BLOCK* path. With the scsi_dh_activate path, it 
> should now be easier because in the send path we do not have to worry 
> about queue locks being held and context.
> 

A little confused... we are still using REQ_TYPE_BLOCK_PC

> I think we could just use blk_execute_rq_nowait to send the IO. Then we 
> would have a workqueue/thread per something (maybe per dh module I 
> thought), that would be queued/notified when the IO completed. The 
> thread could then process the IO and handle the next stage if needed.
> 
> Why use the thread you might wonder? I think it fixes another issue with 
> the original async mode, and makes it easier if the scsi_dh module has 

Can you elaborate on the issue?

> to send more IO. When using the thread it would not have to worry about 
> the queue_lock being held in the IO completion path and does not have to 
> worry about being run from more restrictive contexts.

Do you think queue_lock contention is an issue?

I agree about the restrictive-context issue, though.

So, your suggestion is to move everything to async?

> 
> 
> > 
> > Just wondering if anybody had seen the same problem in other storages
> > (EMC, HP and Alua). 
> 
> They should all have the same problem.
> 
> 
> > 
> > Please share your experiences, so we can come up with a solution that
> > works for all hardware handlers.
> > 
> > regards,
> > 
> > chandra
> > 
> 



* Re: SCSI Hardware Handler and slow failover with large number of LUNS
  2009-04-06 18:21   ` [dm-devel] " Chandra Seetharaman
@ 2009-04-06 18:54     ` Mike Christie
  2009-04-06 20:09       ` [dm-devel] " Chandra Seetharaman
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Christie @ 2009-04-06 18:54 UTC (permalink / raw)
  To: sekharan; +Cc: device-mapper development, Linux SCSI Mailing list, Moger, Babu

Chandra Seetharaman wrote:
> Thanks for the response Mike.
> 
> On Mon, 2009-04-06 at 10:43 -0500, Mike Christie wrote:
>> Chandra Seetharaman wrote:
>>> Hello All,
>>>
>>> During testing with the latest SCSI DH Handler on a rdac storage, Babu
>>> found that the failover time with 100+ luns takes about 15 minutes,
>>> which is not good.
>>>
>>> We found that the problem is due to the fact that we serialize activate
>>> in dm on the work queue.
>>>
>> I thought we talked about this during the review?
> 
> Yes, we did and the results were compared to the virgin code (w.r.t rdac
> handler) and the results were good (also I used only 49 luns) :
> http://marc.info/?l=dm-devel&m=120889858019762&w=2
> 
> 
>>> We can solve the problem in rdac handler in 2 ways
>>>  1. batch up the activates (mode_selects) and send few of them.
>>>  2. Do mode selects in async mode.
>> I think most of the ugliness in the original async mode was due to 
>> trying to use the REQ_BLOCK* path. With the scsi_dh_activate path, it 
>> should now be easier because in the send path we do not have to worry 
>> about queue locks being held and context.
>>
> 
> little confused... we still are using REQ_TYPE_BLOCK_PC
> 

But we only have one level of requests now. I am talking about when we tried 
to send a request with REQ_BLOCK_LINUX_BLOCK to the module to tell it to 
send another request (or requests) with REQ_TYPE_BLOCK_PC. Now we just have 
the callout, and then, like you said, we can fire REQ_TYPE_BLOCK_PC requests 
from there.

I think when I wrote "easier" above, I meant a cleaner implementation.



>> I think we could just use blk_execute_rq_nowait to send the IO. Then we 
>> would have a workqueue/thread per something (maybe per dh module I 
>> thought), that would be queued/notified when the IO completed. The 
>> thread could then process the IO and handle the next stage if needed.
>>
>> Why use the thread you might wonder? I think it fixes another issue with 
>> the original async mode, and makes it easier if the scsi_dh module has 
> 
> can you elaborate the issue ?


I think people did not like the complexity of trying to send IO from 
softirq context with spin locks held, and then also having the extra 
REQ_BLOCK_LINUX_BLOCK layering.


> 
>> to send more IO. When using the thread it would not have to worry about 
>> the queue_lock being held in the IO completion path and does not have to 
>> worry about being run from more restrictive contexts.
> 
> You think queue_lock contention is an issue ?
> 
> I agree with the restrictive context issue though.
> 
> So, your suggestion is to move everything to async ?
> 

Do you mean versus #1, or would you want to separate things and send some 
stuff asynchronously and some synchronously?

>>
>>> Just wondering if anybody had seen the same problem in other storages
>>> (EMC, HP and Alua). 
>> They should all have the same problem.
>>
>>
>>> Please share your experiences, so we can come up with a solution that
>>> works for all hardware handlers.
>>>
>>> regards,
>>>
>>> chandra
>>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [dm-devel] SCSI Hardware Handler and slow failover with large number of LUNS
  2009-04-06 18:54     ` Mike Christie
@ 2009-04-06 20:09       ` Chandra Seetharaman
  0 siblings, 0 replies; 9+ messages in thread
From: Chandra Seetharaman @ 2009-04-06 20:09 UTC (permalink / raw)
  To: Mike Christie
  Cc: device-mapper development, Linux SCSI Mailing list, Moger, Babu


On Mon, 2009-04-06 at 13:54 -0500, Mike Christie wrote:
> > 
> > 
> >>> We can solve the problem in rdac handler in 2 ways
> >>>  1. batch up the activates (mode_selects) and send few of them.
> >>>  2. Do mode selects in async mode.
> >> I think most of the ugliness in the original async mode was due to 
> >> trying to use the REQ_BLOCK* path. With the scsi_dh_activate path, it 
> >> should now be easier because in the send path we do not have to worry 
> >> about queue locks being held and context.
> >>
> > 
> > little confused... we still are using REQ_TYPE_BLOCK_PC
> > 
> 
> But we only have one level of requests. I am talking about when we tried 
> to send a request with REQ_BLOCK_LINUX_BLOCK to the module to tell it to 
> send another request/s with REQ_TYPE_BLOCK_PC. Now we just have the 
> callout and then like you said we can fire REQ_TYPE_BLOCK_PC requests 
> from there.
> 
> I think when I wrote easier above, I meant to write a cleaner 
> implementation.

Now I understand.

> 
> 
> 
> >> I think we could just use blk_execute_rq_nowait to send the IO. Then we 
> >> would have a workqueue/thread per something (maybe per dh module I 
> >> thought), that would be queued/notified when the IO completed. The 
> >> thread could then process the IO and handle the next stage if needed.
> >>
> >> Why use the thread you might wonder? I think it fixes another issue with 
> >> the original async mode, and makes it easier if the scsi_dh module has 
> > 
> > can you elaborate the issue ?
> 
> 
> I think people did not like the complexity of trying to send IO with 
> soft irq context with spin locks held, then also having the extra 
> REQ_BLOCK_LINUX_BLOCK layering.

clear now :)

> 
> 
> > 
> >> to send more IO. When using the thread it would not have to worry about 
> >> the queue_lock being held in the IO completion path and does not have to 
> >> worry about being run from more restrictive contexts.
> > 
> > You think queue_lock contention is an issue ?
> > 
> > I agree with the restrictive context issue though.
> > 
> > So, your suggestion is to move everything to async ?
> > 
> 
> Do you mean versus #1, or would you want to separate things and send some 
> stuff asynchronously and some synchronously?

The other option I was thinking about was to utilize the capabilities of
the underlying device. For example, for LSI rdac, we can batch up the mode
selects and send fewer commands down the wire, which would speed things
up.

I understand that not all devices have a feature like rdac's.
Nevertheless, that doesn't matter, as this will all be _inside_ the
hardware handler, and the handlers _don't_ have to use the same solution.

If async is the best option for all the other handlers, we can just as
well do async for rdac too, which will keep the handlers looking the same :)

> 
> >>
> >>> Just wondering if anybody had seen the same problem in other storages
> >>> (EMC, HP and Alua). 
> >> They should all have the same problem.
> >>
> >>
> >>> Please share your experiences, so we can come up with a solution that
> >>> works for all hardware handlers.
> >>>
> >>> regards,
> >>>
> >>> chandra
> >>>



end of thread, other threads:[~2009-04-06 20:09 UTC | newest]

Thread overview: 9+ messages
2009-04-04 21:30 SCSI Hardware Handler and slow failover with large number of LUNS Chandra Seetharaman
2009-04-06 14:44 ` Moger, Babu
2009-04-06 15:38 ` berthiaume_wayne
2009-04-06 17:47   ` Chandra Seetharaman
2009-04-06 17:59     ` SCSI Hardware Handler and slow failover with large number of LUNS berthiaume_wayne
2009-04-06 15:43 ` SCSI Hardware Handler and slow failover with large number of LUNS Mike Christie
2009-04-06 18:21   ` [dm-devel] " Chandra Seetharaman
2009-04-06 18:54     ` Mike Christie
2009-04-06 20:09       ` [dm-devel] " Chandra Seetharaman
