From: Nilay Shroff
To: linux-nvme@lists.infradead.org
Cc: kbusch@kernel.org, hch@lst.de, sagi@grimberg.me, axboe@fb.com,
	chaitanyak@nvidia.com, gjoyce@linux.ibm.com, Nilay Shroff
Subject: [PATCH 2/2] nvme: make keep-alive synchronous operation
Date: Fri, 4 Oct 2024 17:16:57 +0530
Message-ID: <20241004114711.780809-3-nilay@linux.ibm.com>
In-Reply-To: <20241004114711.780809-1-nilay@linux.ibm.com>
References: <20241004114711.780809-1-nilay@linux.ibm.com>
X-Mailer: git-send-email 2.45.2

The nvme keep-alive operation, which executes at a periodic interval,
could potentially sneak in while shutting down a fabric controller.
This may lead to a race between the fabric controller admin queue
destroy code path (while shutting down controller) and the blk-mq
hw/hctx queuing from the keep-alive thread.

This fix helps avoid the race by implementing keep-alive as a
synchronous operation, so that the admin queue-usage ref counter is
decremented only after the keep-alive command finishes execution and
returns its status. This ensures that we don't inadvertently destroy
the fabric admin queue until we have finished processing the nvme
keep-alive request and its status, and hence it is safe to delete the
queue.

Also, while we are at it, instead of first acquiring the ctrl lock and
then accessing the NVMe controller state, let's use the helper function
nvme_ctrl_state() in nvme_keep_alive_end_io() (renamed to
nvme_keep_alive_finish() by this patch) and get rid of the lock.

Signed-off-by: Nilay Shroff
---
 drivers/nvme/host/core.c | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 02897f0564a3..5a690cf16e5e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1292,14 +1292,14 @@ static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl)
 	queue_delayed_work(nvme_wq, &ctrl->ka_work, delay);
 }
 
-static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
-						 blk_status_t status)
+static void nvme_keep_alive_finish(struct request *rq,
+				   blk_status_t status,
+				   struct nvme_ctrl *ctrl)
 {
-	struct nvme_ctrl *ctrl = rq->end_io_data;
-	unsigned long flags;
 	bool startka = false;
 	unsigned long rtt = jiffies - (rq->deadline - rq->timeout);
 	unsigned long delay = nvme_keep_alive_work_period(ctrl);
+	enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
 
 	/*
 	 * Subtract off the keepalive RTT so nvme_keep_alive_work runs
@@ -1313,25 +1313,19 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
 		delay = 0;
 	}
 
-	blk_mq_free_request(rq);
-
 	if (status) {
 		dev_err(ctrl->device,
 			"failed nvme_keep_alive_end_io error=%d\n",
 				status);
-		return RQ_END_IO_NONE;
+		return;
 	}
 
 	ctrl->ka_last_check_time = jiffies;
 	ctrl->comp_seen = false;
-	spin_lock_irqsave(&ctrl->lock, flags);
-	if (ctrl->state == NVME_CTRL_LIVE ||
-	    ctrl->state == NVME_CTRL_CONNECTING)
+	if (state == NVME_CTRL_LIVE || state == NVME_CTRL_CONNECTING)
 		startka = true;
-	spin_unlock_irqrestore(&ctrl->lock, flags);
 	if (startka)
 		queue_delayed_work(nvme_wq, &ctrl->ka_work, delay);
-	return RQ_END_IO_NONE;
 }
 
 static void nvme_keep_alive_work(struct work_struct *work)
@@ -1340,6 +1334,7 @@ static void nvme_keep_alive_work(struct work_struct *work)
 			struct nvme_ctrl, ka_work);
 	bool comp_seen = ctrl->comp_seen;
 	struct request *rq;
+	blk_status_t status;
 
 	ctrl->ka_last_check_time = jiffies;
 
@@ -1362,9 +1357,9 @@ static void nvme_keep_alive_work(struct work_struct *work)
 	nvme_init_request(rq, &ctrl->ka_cmd);
 
 	rq->timeout = ctrl->kato * HZ;
-	rq->end_io = nvme_keep_alive_end_io;
-	rq->end_io_data = ctrl;
-	blk_execute_rq_nowait(rq, false);
+	status = blk_execute_rq(rq, false);
+	nvme_keep_alive_finish(rq, status, ctrl);
+	blk_mq_free_request(rq);
 }
 
 static void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
-- 
2.45.2
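
The ordering argument in the commit message can be illustrated with a small user-space sketch. This is a toy model and not kernel code: every `toy_*` name is hypothetical, the counter only stands in for the admin queue-usage ref counter, and each function replays just one possible interleaving of the keep-alive thread and the teardown path. It assumes the problem case is the one where the reference is dropped from the completion callback while the submitting thread is still queuing.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an admin queue protected by a usage counter. */
struct toy_queue {
	int usage;		/* models the queue-usage reference count */
	bool destroyed;		/* set once teardown has torn the queue down */
	int stale_accesses;	/* touches that happened after destruction */
};

static void toy_get(struct toy_queue *q)
{
	q->usage++;
}

static void toy_put(struct toy_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/* Teardown may only destroy the queue once nobody holds a reference. */
static bool toy_try_destroy(struct toy_queue *q)
{
	if (q->usage > 0)
		return false;
	q->destroyed = true;
	return true;
}

/* Any access by the submitting thread; records use after destruction. */
static void toy_touch(struct toy_queue *q)
{
	if (q->destroyed)
		q->stale_accesses++;
}

/* Async model: the completion callback drops the reference. */
int toy_async_stale_accesses(void)
{
	struct toy_queue q = { 0 };

	toy_get(&q);		/* request allocated */
	toy_touch(&q);		/* submitter starts queuing */
	toy_put(&q);		/* completion callback frees the request */
	toy_try_destroy(&q);	/* teardown sees usage == 0 and proceeds */
	toy_touch(&q);		/* submitter is still finishing its queuing */
	return q.stale_accesses;
}

/* Sync model: the submitter drops the reference once it is done. */
int toy_sync_stale_accesses(void)
{
	struct toy_queue q = { 0 };
	bool destroyed_early;

	toy_get(&q);		/* request allocated */
	toy_touch(&q);		/* submitter starts queuing */
	destroyed_early = toy_try_destroy(&q);	/* refused: usage == 1 */
	toy_touch(&q);		/* submitter finishes, status is returned */
	toy_put(&q);		/* submitter frees the request */
	toy_try_destroy(&q);	/* only now may teardown proceed */
	return q.stale_accesses + (destroyed_early ? 1 : 0);
}
```

In this model `toy_async_stale_accesses()` reports one access after destruction, while `toy_sync_stale_accesses()` reports none, because teardown is refused for as long as the submitter still holds its reference.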