linux-scsi.vger.kernel.org archive mirror
* Question about iscsi session block
@ 2022-02-15 15:49 Zhengyuan Liu
  2022-02-15 16:31 ` Mike Christie
  0 siblings, 1 reply; 6+ messages in thread
From: Zhengyuan Liu @ 2022-02-15 15:49 UTC (permalink / raw)
  To: linux-scsi, open-iscsi, dm-devel; +Cc: lduncan, leech, bob.liu

Hi, all

We have a production server that uses multipath + iscsi to attach storage
from a storage server. The server has two NICs; each NIC carries about 20
iscsi sessions, and each session includes about 50 iscsi devices (so there
are roughly 2*20*50 = 2000 iscsi block devices on the server in total).
The problem is that once a NIC faults, it takes far too long (nearly 80s)
for multipath to switch over to the healthy NIC, because all iscsi devices
on the faulted NIC must first be blocked. The call chain is shown below:

    void iscsi_block_session(struct iscsi_cls_session *session)
    {
        queue_work(iscsi_eh_timer_workq, &session->block_work);
    }

 __iscsi_block_session() -> scsi_target_block() -> target_block() ->
  device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
  blk_mq_quiesce_queue() -> synchronize_rcu()
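
For context, the tail of that chain looks roughly like this (a simplified
paraphrase of the block layer code, not a verbatim copy; the real function
also handles BLK_MQ_F_BLOCKING queues with SRCU):

    void blk_mq_quiesce_queue(struct request_queue *q)
    {
        blk_mq_quiesce_queue_nowait(q);
        /* Wait out a full RCU grace period so no dispatch can still be
         * running under the RCU read lock; this wait is where almost
         * all of the per-device blocking time goes. */
        synchronize_rcu();
    }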

All sessions and all devices are processed sequentially, and our tracing
shows that each synchronize_rcu() call takes about 80ms, so the total cost
is about 80s (80ms * 20 * 50). That is longer than the application can
tolerate, and it may interrupt service.
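
As a sanity check on that arithmetic, a trivial userspace C program using
the counts above (illustrative only):

    #include <stdio.h>

    int main(void)
    {
        int sessions = 20, devices = 50;  /* per faulted NIC */
        double rcu_ms = 80.0;             /* traced synchronize_rcu() cost */

        /* sequential blocking: one RCU grace period per device */
        printf("total: %.0f s\n", sessions * devices * rcu_ms / 1000.0);
        return 0;                         /* prints "total: 80 s" */
    }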

So my question is: can we optimize this procedure to reduce the time spent
blocking all iscsi devices? I'm not sure whether increasing the max_active
of iscsi_eh_timer_workq to improve concurrency is a good idea.
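
For reference, that idea would amount to something like the following at
workqueue creation time (a sketch only; the queue name and flags here are
assumptions, not the actual scsi_transport_iscsi.c code):

    /* today: max_active == 1, so the block_work of every session runs
     * strictly one after another on this queue */
    iscsi_eh_timer_workq = alloc_workqueue("iscsi_eh", WQ_SYSFS, 1);

    /* the idea: raise max_active so several sessions' block_work
     * instances can run concurrently */
    iscsi_eh_timer_workq = alloc_workqueue("iscsi_eh", WQ_SYSFS, 4);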

Thanks in advance.


* Re: Question about iscsi session block
  2022-02-15 15:49 Question about iscsi session block Zhengyuan Liu
@ 2022-02-15 16:31 ` Mike Christie
  2022-02-16  1:28   ` Zhengyuan Liu
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Christie @ 2022-02-15 16:31 UTC (permalink / raw)
  To: Zhengyuan Liu, linux-scsi, open-iscsi, dm-devel; +Cc: lduncan, leech

On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
> Hi, all
> 
> We have a production server that uses multipath + iscsi to attach storage
> from a storage server. The server has two NICs; each NIC carries about 20
> iscsi sessions, and each session includes about 50 iscsi devices (so there
> are roughly 2*20*50 = 2000 iscsi block devices on the server in total).
> The problem is that once a NIC faults, it takes far too long (nearly 80s)
> for multipath to switch over to the healthy NIC, because all iscsi devices
> on the faulted NIC must first be blocked. The call chain is shown below:
> 
>     void iscsi_block_session(struct iscsi_cls_session *session)
>     {
>         queue_work(iscsi_eh_timer_workq, &session->block_work);
>     }
> 
>  __iscsi_block_session() -> scsi_target_block() -> target_block() ->
>   device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
>  blk_mq_quiesce_queue() -> synchronize_rcu()
> 
> All sessions and all devices are processed sequentially, and our tracing
> shows that each synchronize_rcu() call takes about 80ms, so the total cost
> is about 80s (80ms * 20 * 50). That is longer than the application can
> tolerate, and it may interrupt service.
> 
> So my question is: can we optimize this procedure to reduce the time spent
> blocking all iscsi devices? I'm not sure whether increasing the max_active
> of iscsi_eh_timer_workq to improve concurrency is a good idea.

We need a patch so that the unblock call waits for/cancels/flushes the block
call; otherwise the two could run in parallel.

I'll send a patchset later today so you can test it.


* Re: Question about iscsi session block
  2022-02-15 16:31 ` Mike Christie
@ 2022-02-16  1:28   ` Zhengyuan Liu
  2022-02-16  2:19     ` michael.christie
  0 siblings, 1 reply; 6+ messages in thread
From: Zhengyuan Liu @ 2022-02-16  1:28 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi, open-iscsi, dm-devel, lduncan, leech

On Wed, Feb 16, 2022 at 12:31 AM Mike Christie
<michael.christie@oracle.com> wrote:
>
> On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
> > Hi, all
> >
> > We have a production server that uses multipath + iscsi to attach storage
> > from a storage server. The server has two NICs; each NIC carries about 20
> > iscsi sessions, and each session includes about 50 iscsi devices (so there
> > are roughly 2*20*50 = 2000 iscsi block devices on the server in total).
> > The problem is that once a NIC faults, it takes far too long (nearly 80s)
> > for multipath to switch over to the healthy NIC, because all iscsi devices
> > on the faulted NIC must first be blocked. The call chain is shown below:
> >
> >     void iscsi_block_session(struct iscsi_cls_session *session)
> >     {
> >         queue_work(iscsi_eh_timer_workq, &session->block_work);
> >     }
> >
> >  __iscsi_block_session() -> scsi_target_block() -> target_block() ->
> >   device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
> >  blk_mq_quiesce_queue() -> synchronize_rcu()
> >
> > All sessions and all devices are processed sequentially, and our tracing
> > shows that each synchronize_rcu() call takes about 80ms, so the total cost
> > is about 80s (80ms * 20 * 50). That is longer than the application can
> > tolerate, and it may interrupt service.
> >
> > So my question is: can we optimize this procedure to reduce the time spent
> > blocking all iscsi devices? I'm not sure whether increasing the max_active
> > of iscsi_eh_timer_workq to improve concurrency is a good idea.
>
> We need a patch so that the unblock call waits for/cancels/flushes the block
> call; otherwise the two could run in parallel.
>
> I'll send a patchset later today so you can test it.

I'm glad to test once you push the patchset.

Thank you, Mike.


* Re: Question about iscsi session block
  2022-02-16  1:28   ` Zhengyuan Liu
@ 2022-02-16  2:19     ` michael.christie
  2022-02-26 23:00       ` Mike Christie
  0 siblings, 1 reply; 6+ messages in thread
From: michael.christie @ 2022-02-16  2:19 UTC (permalink / raw)
  To: Zhengyuan Liu; +Cc: linux-scsi, open-iscsi, dm-devel, lduncan, leech

On 2/15/22 7:28 PM, Zhengyuan Liu wrote:
> On Wed, Feb 16, 2022 at 12:31 AM Mike Christie
> <michael.christie@oracle.com> wrote:
>>
>> On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
>>> Hi, all
>>>
>>> We have a production server that uses multipath + iscsi to attach storage
>>> from a storage server. The server has two NICs; each NIC carries about 20
>>> iscsi sessions, and each session includes about 50 iscsi devices (so there
>>> are roughly 2*20*50 = 2000 iscsi block devices on the server in total).
>>> The problem is that once a NIC faults, it takes far too long (nearly 80s)
>>> for multipath to switch over to the healthy NIC, because all iscsi devices
>>> on the faulted NIC must first be blocked. The call chain is shown below:
>>>
>>>     void iscsi_block_session(struct iscsi_cls_session *session)
>>>     {
>>>         queue_work(iscsi_eh_timer_workq, &session->block_work);
>>>     }
>>>
>>>  __iscsi_block_session() -> scsi_target_block() -> target_block() ->
>>>   device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
>>>  blk_mq_quiesce_queue() -> synchronize_rcu()
>>>
>>> All sessions and all devices are processed sequentially, and our tracing
>>> shows that each synchronize_rcu() call takes about 80ms, so the total cost
>>> is about 80s (80ms * 20 * 50). That is longer than the application can
>>> tolerate, and it may interrupt service.
>>>
>>> So my question is: can we optimize this procedure to reduce the time spent
>>> blocking all iscsi devices? I'm not sure whether increasing the max_active
>>> of iscsi_eh_timer_workq to improve concurrency is a good idea.
>>
>> We need a patch so that the unblock call waits for/cancels/flushes the block
>> call; otherwise the two could run in parallel.
>>
>> I'll send a patchset later today so you can test it.
> 
> I'm glad to test once you push the patchset.
> 
> Thank you, Mike.

I forgot I did this recently :)

commit 7ce9fc5ecde0d8bd64c29baee6c5e3ce7074ec9a
Author: Mike Christie <michael.christie@oracle.com>
Date:   Tue May 25 13:18:09 2021 -0500

    scsi: iscsi: Flush block work before unblock
    
    We set the max_active iSCSI EH works to 1, so all work is going to execute
    in order by default. However, userspace can now override this in sysfs. If
    max_active > 1, we can end up with the block_work on CPU1 and
    iscsi_unblock_session running the unblock_work on CPU2 and the session and
    target/device state will end up out of sync with each other.
    
    This adds a flush of the block_work in iscsi_unblock_session.


It was merged in 5.14.
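
In code terms the change amounts to roughly the following (a sketch of the
idea, not the exact mainline diff):

    void iscsi_unblock_session(struct iscsi_cls_session *session)
    {
        /* make sure a queued or running block_work has finished, so
         * block and unblock can no longer race on different CPUs */
        flush_work(&session->block_work);
        queue_work(iscsi_eh_timer_workq, &session->unblock_work);
    }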


* Re: Question about iscsi session block
  2022-02-16  2:19     ` michael.christie
@ 2022-02-26 23:00       ` Mike Christie
  2022-05-24  6:29         ` Zhengyuan Liu
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Christie @ 2022-02-26 23:00 UTC (permalink / raw)
  To: Zhengyuan Liu; +Cc: linux-scsi, open-iscsi, dm-devel, lduncan, leech

On 2/15/22 8:19 PM, michael.christie@oracle.com wrote:
> On 2/15/22 7:28 PM, Zhengyuan Liu wrote:
>> On Wed, Feb 16, 2022 at 12:31 AM Mike Christie
>> <michael.christie@oracle.com> wrote:
>>>
>>> On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
>>>> Hi, all
>>>>
>>>> We have a production server that uses multipath + iscsi to attach storage
>>>> from a storage server. The server has two NICs; each NIC carries about 20
>>>> iscsi sessions, and each session includes about 50 iscsi devices (so there
>>>> are roughly 2*20*50 = 2000 iscsi block devices on the server in total).
>>>> The problem is that once a NIC faults, it takes far too long (nearly 80s)
>>>> for multipath to switch over to the healthy NIC, because all iscsi devices
>>>> on the faulted NIC must first be blocked. The call chain is shown below:
>>>>
>>>>     void iscsi_block_session(struct iscsi_cls_session *session)
>>>>     {
>>>>         queue_work(iscsi_eh_timer_workq, &session->block_work);
>>>>     }
>>>>
>>>>  __iscsi_block_session() -> scsi_target_block() -> target_block() ->
>>>>   device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
>>>>  blk_mq_quiesce_queue() -> synchronize_rcu()
>>>>
>>>> All sessions and all devices are processed sequentially, and our tracing
>>>> shows that each synchronize_rcu() call takes about 80ms, so the total cost
>>>> is about 80s (80ms * 20 * 50). That is longer than the application can
>>>> tolerate, and it may interrupt service.
>>>>
>>>> So my question is: can we optimize this procedure to reduce the time spent
>>>> blocking all iscsi devices? I'm not sure whether increasing the max_active
>>>> of iscsi_eh_timer_workq to improve concurrency is a good idea.
>>>
>>> We need a patch so that the unblock call waits for/cancels/flushes the block
>>> call; otherwise the two could run in parallel.
>>>
>>> I'll send a patchset later today so you can test it.
>>
>> I'm glad to test once you push the patchset.
>>
>> Thank you, Mike.
> 
> I forgot I did this recently :)
> 
> commit 7ce9fc5ecde0d8bd64c29baee6c5e3ce7074ec9a
> Author: Mike Christie <michael.christie@oracle.com>
> Date:   Tue May 25 13:18:09 2021 -0500
> 
>     scsi: iscsi: Flush block work before unblock
>     
>     We set the max_active iSCSI EH works to 1, so all work is going to execute
>     in order by default. However, userspace can now override this in sysfs. If
>     max_active > 1, we can end up with the block_work on CPU1 and
>     iscsi_unblock_session running the unblock_work on CPU2 and the session and
>     target/device state will end up out of sync with each other.
>     
>     This adds a flush of the block_work in iscsi_unblock_session.
> 
> 
> It was merged in 5.14.

Hey, I found one more bug when max_active > 1. While fixing it I decided to
just fix this properly, so we can do session recoveries in parallel and the
user doesn't have to worry about setting max_active.

I'll send a patchset and cc you.
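
One plausible shape for that, purely as a sketch (the per-session queue and
the names here are assumptions, not the actual patchset):

    /* give each session its own workqueue so the block/unblock works
     * of different sessions no longer serialize behind one queue */
    session->workq = alloc_workqueue("iscsi_ctrl_%d",
                                     WQ_SYSFS | WQ_MEM_RECLAIM, 0,
                                     session->sid);

    /* recovery work is then queued per session */
    queue_work(session->workq, &session->block_work);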


* Re: Question about iscsi session block
  2022-02-26 23:00       ` Mike Christie
@ 2022-05-24  6:29         ` Zhengyuan Liu
  0 siblings, 0 replies; 6+ messages in thread
From: Zhengyuan Liu @ 2022-05-24  6:29 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi, open-iscsi, dm-devel, lduncan, leech

Hi, Mike,

Sorry for the delayed reply; I had no environment to test your patchset
below until recently:

https://lore.kernel.org/all/20220226230435.38733-1-michael.christie@oracle.com/

After applying that series, the total time dropped from 80s to nearly 10s,
which is a great improvement.

Thanks again.

On Sun, Feb 27, 2022 at 7:00 AM Mike Christie
<michael.christie@oracle.com> wrote:
>
> On 2/15/22 8:19 PM, michael.christie@oracle.com wrote:
> > On 2/15/22 7:28 PM, Zhengyuan Liu wrote:
> >> On Wed, Feb 16, 2022 at 12:31 AM Mike Christie
> >> <michael.christie@oracle.com> wrote:
> >>>
> >>> On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
> >>>> Hi, all
> >>>>
> >>>> We have a production server that uses multipath + iscsi to attach storage
> >>>> from a storage server. The server has two NICs; each NIC carries about 20
> >>>> iscsi sessions, and each session includes about 50 iscsi devices (so there
> >>>> are roughly 2*20*50 = 2000 iscsi block devices on the server in total).
> >>>> The problem is that once a NIC faults, it takes far too long (nearly 80s)
> >>>> for multipath to switch over to the healthy NIC, because all iscsi devices
> >>>> on the faulted NIC must first be blocked. The call chain is shown below:
> >>>>
> >>>>     void iscsi_block_session(struct iscsi_cls_session *session)
> >>>>     {
> >>>>         queue_work(iscsi_eh_timer_workq, &session->block_work);
> >>>>     }
> >>>>
> >>>>  __iscsi_block_session() -> scsi_target_block() -> target_block() ->
> >>>>   device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
> >>>>  blk_mq_quiesce_queue() -> synchronize_rcu()
> >>>>
> >>>> All sessions and all devices are processed sequentially, and our tracing
> >>>> shows that each synchronize_rcu() call takes about 80ms, so the total cost
> >>>> is about 80s (80ms * 20 * 50). That is longer than the application can
> >>>> tolerate, and it may interrupt service.
> >>>>
> >>>> So my question is: can we optimize this procedure to reduce the time spent
> >>>> blocking all iscsi devices? I'm not sure whether increasing the max_active
> >>>> of iscsi_eh_timer_workq to improve concurrency is a good idea.
> >>>
> >>> We need a patch so that the unblock call waits for/cancels/flushes the block
> >>> call; otherwise the two could run in parallel.
> >>>
> >>> I'll send a patchset later today so you can test it.
> >>
> >> I'm glad to test once you push the patchset.
> >>
> >> Thank you, Mike.
> >
> > I forgot I did this recently :)
> >
> > commit 7ce9fc5ecde0d8bd64c29baee6c5e3ce7074ec9a
> > Author: Mike Christie <michael.christie@oracle.com>
> > Date:   Tue May 25 13:18:09 2021 -0500
> >
> >     scsi: iscsi: Flush block work before unblock
> >
> >     We set the max_active iSCSI EH works to 1, so all work is going to execute
> >     in order by default. However, userspace can now override this in sysfs. If
> >     max_active > 1, we can end up with the block_work on CPU1 and
> >     iscsi_unblock_session running the unblock_work on CPU2 and the session and
> >     target/device state will end up out of sync with each other.
> >
> >     This adds a flush of the block_work in iscsi_unblock_session.
> >
> >
> > It was merged in 5.14.
>
> Hey, I found one more bug when max_active > 1. While fixing it I decided to
> just fix this properly, so we can do session recoveries in parallel and the
> user doesn't have to worry about setting max_active.
>
> I'll send a patchset and cc you.

