From mboxrd@z Thu Jan  1 00:00:00 1970
From: Bart Van Assche
Subject: Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
Date: Sat, 15 Jun 2013 11:52:05 +0200
Message-ID: <51BC3945.9030900@acm.org>
References: <51B87501.4070005@acm.org> <51B8777B.5050201@acm.org>
 <51BA20ED.6040200@mellanox.com> <51BB1857.7040802@acm.org>
 <51BB5A04.3080901@mellanox.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <51BB5A04.3080901-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Vu Pham
Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma, linux-scsi,
 James Bottomley
List-Id: linux-rdma@vger.kernel.org

On 06/14/13 19:59, Vu Pham wrote:
>> On 06/13/13 21:43, Vu Pham wrote:
>>>> +/**
>>>> + * srp_tmo_valid() - check timeout combination validity
>>>> + *
>>>> + * If no fast I/O fail timeout has been configured then the device loss timeout
>>>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail timeout has
>>>> + * been configured then it must be below the device loss timeout.
>>>> + */
>>>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>>>> +{
>>>> +	return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>>>> +		dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>>>> +		|| (0 <= fast_io_fail_tmo &&
>>>> +		    (dev_loss_tmo < 0 ||
>>>> +		     (fast_io_fail_tmo < dev_loss_tmo &&
>>>> +		      dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>>> If fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with a
>>> negative value; if dev_loss_tmo is off, one cannot turn off
>>> fast_io_fail_tmo with a negative value.
>>
>> OK, I will update the documentation such that it correctly refers to
>> "off" instead of a negative value, and I will also mention that
>> dev_loss_tmo can now be disabled.
>>
> It's not only the documentation but also the code logic: with the
> return statement above you cannot turn dev_loss_tmo off if
> fast_io_fail_tmo is already turned off, and vice versa.

Does this mean that you think it would be useful to disable both the
fast_io_fail and the dev_loss mechanisms, and hence to rely on the user
to remove remote ports that have disappeared and on the SCSI command
timeout to detect path failures? I'll start testing this combination to
see whether it triggers any adverse behavior.

>>> If the rport's state is already SRP_RPORT_BLOCKED, I don't think we
>>> need to do an extra block with scsi_block_requests().
>>
>> Please keep in mind that srp_reconnect_rport() can be called from two
>> different contexts: that function can not only be called from inside
>> the SRP transport layer but also from inside the SCSI error handler
>> (see also the srp_reset_device() modifications in a later patch in
>> this series). If this function is invoked from the context of the
>> SCSI error handler the chance is high that the SCSI device will be in
>> a state other than SDEV_BLOCK. Hence the scsi_block_requests() call
>> in this function.
> Yes, srp_reconnect_rport() can be called from two contexts; however,
> it deals with the same rport and the same rport state. I'm thinking of
> something like this:
>
> if (rport->state != SRP_RPORT_BLOCKED) {
>	scsi_block_requests(shost);

Sorry, but I'm afraid that approach would still allow the user to
unblock one or more SCSI devices via sysfs during the
i->f->reconnect(rport) call, something we do not want.

> I think that we can use only the pair
> scsi_block_requests()/scsi_unblock_requests() unless the advantage of
> multipathd recognizing SDEV_BLOCK is noticeable.

I think the advantage of multipathd recognizing the SDEV_BLOCK state
before the fast_io_fail_tmo timer has expired is important.
Multipathd does not queue I/O to paths that are in the SDEV_BLOCK state,
so setting that state helps I/O fail over more quickly, especially for
large values of fast_io_fail_tmo.

Hope this helps,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html