From mboxrd@z Thu Jan 1 00:00:00 1970
From: jianchao.w.wang@oracle.com (jianchao.wang)
Date: Fri, 9 Feb 2018 09:41:59 +0800
Subject: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case
In-Reply-To: <20180208151508.GA4797@localhost.localdomain>
References: <20180202182413.GH24417@localhost.localdomain>
 <20180205151314.GP24417@localhost.localdomain>
 <20180206151335.GE31110@localhost.localdomain>
 <20180207161345.GB1337@localhost.localdomain>
 <1826ebc1-d419-23da-12d4-dd7b1b3fe598@oracle.com>
 <958cae59-1a01-d60f-822b-cf81cfa31b8f@oracle.com>
 <20180208151508.GA4797@localhost.localdomain>
Message-ID:

Hi Keith

Thanks for your precious time and kind response.

On 02/08/2018 11:15 PM, Keith Busch wrote:
> On Thu, Feb 08, 2018 at 10:17:00PM +0800, jianchao.wang wrote:
>> There is a dangerous scenario which caused by nvme_wait_freeze in nvme_reset_work.
>> please consider it.
>>
>> nvme_reset_work
>>   -> nvme_start_queues
>>   -> nvme_wait_freeze
>>
>> if the controller no response, we have to rely on the timeout path.
>> there are issues below:
>> nvme_dev_disable need to be invoked.
>> nvme_dev_disable will quiesce queues, cancel and requeue and outstanding requests.
>> nvme_reset_work will hang at nvme_wait_freeze
>
> We used to not requeue timed out commands, so that wasn't a problem
> before. Oh well, I'll take a look.
>

Yes, we indeed don't requeue the timed-out commands, but nvme_dev_disable
will requeue the other outstanding requests and quiesce the request queues.
This blocks nvme_reset_work->nvme_wait_freeze from moving forward.

As I shared in my last email, can we use (or abuse?) blk_set_preempt_only
to gate the new bios in generic_make_request?
Freezing queues is good, but wait_freeze in reset_work is a devil.

Many thanks
Jianchao
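
The hang described above can be illustrated with a small userspace toy model
in C. This is only a sketch under stated assumptions, not kernel code: every
toy_* name is hypothetical, the usage field merely stands in for the queue's
usage counter, and the real wait is unbounded whereas this one gives up after
a fixed number of passes so that the example terminates. It also simplifies
by requeueing every outstanding request, while in the real timeout path the
timed-out command itself is not requeued.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; NOT the kernel's struct request_queue. */
struct toy_queue {
	int  usage;     /* requests still holding a queue reference */
	int  requeued;  /* requests parked on the requeue list */
	bool quiesced;  /* dispatch is blocked while true */
	bool frozen;    /* new submissions would be gated while true */
};

/*
 * Models the nvme_dev_disable behaviour described in the mail: quiesce
 * the queue, then requeue the outstanding requests. Requeued requests
 * keep their reference (assumption of this model).
 */
static void toy_dev_disable(struct toy_queue *q)
{
	q->quiesced = true;
	q->requeued = q->usage;
}

/* Models nvme_start_queues: allow dispatch again. */
static void toy_start_queues(struct toy_queue *q)
{
	q->quiesced = false;
}

/* One dispatch pass: requeued requests complete only if not quiesced. */
static void toy_run_queue(struct toy_queue *q)
{
	if (q->quiesced)
		return;
	q->usage -= q->requeued;
	q->requeued = 0;
	assert(q->usage >= 0);
}

/* Bounded stand-in for nvme_wait_freeze: true if usage drained to zero. */
static bool toy_wait_freeze(struct toy_queue *q, int max_passes)
{
	q->frozen = true;
	for (int i = 0; i < max_passes && q->usage > 0; i++)
		toy_run_queue(q);
	return q->usage == 0;
}

/*
 * Reset sequence from the mail. If the controller does not respond, the
 * timeout path runs toy_dev_disable between start_queues and wait_freeze,
 * so the queue is quiesced again and the wait can never finish.
 */
bool toy_reset_work(int outstanding, bool controller_responds)
{
	struct toy_queue q = { .usage = outstanding };

	toy_dev_disable(&q);            /* shutdown before the reset */
	toy_start_queues(&q);
	if (!controller_responds)
		toy_dev_disable(&q);    /* timeout path quiesces again */
	return toy_wait_freeze(&q, 8);
}
```

In this model toy_reset_work(4, true) drains the queue and returns true,
while toy_reset_work(4, false) returns false because the requeued requests
sit behind a quiesced queue, which corresponds to nvme_reset_work hanging in
nvme_wait_freeze.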