linux-scsi.vger.kernel.org archive mirror
* multipath_busy() stalls IO due to scsi_host_is_busy()
@ 2012-05-16 12:28 Bernd Schubert
  2012-05-16 14:06 ` James Bottomley
  2012-05-17  9:09 ` multipath_busy() stalls IO due to scsi_host_is_busy() Jun'ichi Nomura
  0 siblings, 2 replies; 15+ messages in thread
From: Bernd Schubert @ 2012-05-16 12:28 UTC
  To: dm-devel; +Cc: linux-scsi@vger.kernel.org

Hello,

While I actually want to benchmark FhGFS on a NetApp system, I keep 
running into one kernel problem after another.
Yesterday we had to recable, and while we are still using multipath, 
each priority group now has only one underlying device (we don't have 
sufficient IB SRP ports on our test systems, but still want to benchmark 
a system as close as possible to a production system).
So after recabling, all failover paths disappeared, which *shouldn't* 
have any influence on performance. However, unexpectedly, performance 
has now dropped to less than 50% when I'm doing buffered IO. With 
direct IO it is still fine, and reducing nr_requests of the multipath 
device to 8 also 'fixes' the problem. I then guessed right and simply 
made multipath_busy() always return 0, which also fixes the issue.


- problem:
	- iostat -x -m 1 shows that the multipath devices alternate: one of 
them stalls IO for several minutes
	- the other multipath device does IO during that time at about 
600 to 700 MB/s, until it in turn starts to stall IO
	- the active NetApp controller could serve both multipath devices at 
about 600 to 700 MB/s

- problem solutions:
	- add another passive sdX device to the multipath group
	- use direct IO
	- reduce /sys/block/dm-X/queue/nr_requests to 8
		- /sys/block/sdX does not need to be updated
	- disable multipath_busy() by letting it return 0

Looking through the call chain, I see the underlying problem seems to be 
in scsi_host_is_busy().

> static inline int scsi_host_is_busy(struct Scsi_Host *shost)
> {
> 	if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
> 	    shost->host_blocked || shost->host_self_blocked)
> 		return 1;
>
> 	return 0;
> }

shost->can_queue -> 62 here
shost->host_busy -> 62 while one of the multipath groups does IO; the 
other multipath groups then seem to get stalled.
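
To make the observation above concrete, here is a small user-space model of
the check. It is only a sketch and not kernel code: struct model_host and the
model_*() helpers are hypothetical names, and it assumes that both multipath
devices end up on the same Scsi_Host and therefore share its can_queue slots.
The busy condition itself is copied from the scsi_host_is_busy() quoted above.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct Scsi_Host. */
struct model_host {
	int can_queue;           /* max commands the host accepts at once */
	int host_busy;           /* commands currently in flight */
	bool host_blocked;
	bool host_self_blocked;
};

/* Same condition as the scsi_host_is_busy() quoted above. */
static bool model_host_is_busy(const struct model_host *h)
{
	return (h->can_queue > 0 && h->host_busy >= h->can_queue) ||
	       h->host_blocked || h->host_self_blocked;
}

/* Try to dispatch one command; returns false if the host refuses it. */
static bool model_dispatch(struct model_host *h)
{
	if (model_host_is_busy(h))
		return false;
	h->host_busy++;
	return true;
}

/* Complete one in-flight command, which frees one host slot. */
static void model_complete(struct model_host *h)
{
	if (h->host_busy > 0)
		h->host_busy--;
}

/* Dispatch up to n commands for one device; returns how many were accepted. */
static int model_submit(struct model_host *h, int n)
{
	int accepted = 0;

	while (accepted < n && model_dispatch(h))
		accepted++;
	return accepted;
}
```

With can_queue set to 62, a first device that submits a large batch takes all
62 slots, and any submission from a second device on the same host is refused
until a completion frees a slot. That matches the host_busy == can_queue state
seen here, but it does not by itself explain why the stall lasts for minutes.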

I'm not sure yet why multipath_busy() does not stall IO when there is a 
passive path in the prio group.

Any idea how to properly address this problem?


Thanks,
Bernd


Thread overview: 15+ messages
2012-05-16 12:28 multipath_busy() stalls IO due to scsi_host_is_busy() Bernd Schubert
2012-05-16 14:06 ` James Bottomley
2012-05-16 14:29   ` Bernd Schubert
2012-05-16 15:27     ` [dm-devel] " Mike Christie
2012-05-16 15:54         ` Bernd Schubert
2012-05-16 17:03             ` David Dillow
2012-05-16 20:34               ` Bernd Schubert
2012-05-21 15:49               ` [PATCH] srp: convert SRP_RQ_SHIFT into a module parameter Bernd Schubert
2012-05-21 19:35                   ` Bernd Schubert
2012-05-30  5:22                   ` David Dillow
2012-05-30  5:24                       ` David Dillow
2012-05-17  9:09 ` multipath_busy() stalls IO due to scsi_host_is_busy() Jun'ichi Nomura
2012-05-17 13:46   ` Mike Snitzer
2012-05-21 15:42   ` Bernd Schubert
2012-05-22  4:31     ` Jun'ichi Nomura
