From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 9F4B9C77B73
	for ; Mon, 1 May 2023 02:58:54 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232367AbjEAC6x (ORCPT );
	Sun, 30 Apr 2023 22:58:53 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49056 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S232234AbjEAC61 (ORCPT );
	Sun, 30 Apr 2023 22:58:27 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7C251BD4;
	Sun, 30 Apr 2023 19:57:57 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id 9656F615E6;
	Mon, 1 May 2023 02:57:33 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 40D86C433EF;
	Mon, 1 May 2023 02:57:31 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1682909853;
	bh=CG6iidfJYvBtjEQvRVQLTsIrpzO11PNYCTvVzvScUro=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=t0mtl1JDv4A5vuIMdKDjMJE2ASvu10tAbFNmtUZUhShZIhgcYI+VSK/Iueuuvgfik
	 8REf+jLM2C7TB0WlRIwvYJ/p4Hd4m5XucPs4R4Bh8vWij1i0cW7NgzJRJl7AxLxsi0
	 JtSZ2O++0Qndg/bdLW3lS2PdF0srr9iKFlpMoc0s0++SqEBDyKRHkxOx6Mf6CY0Nya
	 rdOagVrsyLqGHHn3xrYHzPjVUS3uC058qK3LooD7wutw5RDQnEr05pJRr3sfIFBlGQ
	 nnWO6GgEjhcmkMXL2mJVErkZblVhCDhs6Ow3XB1EmQKadg4Rqi9vvi+gk3PGodbccb
	 4S1VEq7x9f8wg==
From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Tomer Tayar, Oded Gabbay, Sasha Levin,
	stanislaw.gruszka@linux.intel.com, gregkh@linuxfoundation.org,
	dhirschfeld@habana.ai,
	dliberman@habana.ai, osharabi@habana.ai, obitton@habana.ai,
	talcohen@habana.ai, ynudelman@habana.ai, fkassabri@habana.ai,
	bjauhari@habana.ai, colin.i.king@gmail.com,
	dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.3 11/44] accel/habanalabs: postpone mem_mgr IDR destruction to hpriv_release()
Date: Sun, 30 Apr 2023 22:55:59 -0400
Message-Id: <20230501025632.3253067-11-sashal@kernel.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230501025632.3253067-1-sashal@kernel.org>
References: <20230501025632.3253067-1-sashal@kernel.org>
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Tomer Tayar

[ Upstream commit 2e8e9a895c4589f124a37fc84d123b5114406e94 ]

The memory manager IDR is currently destroyed when user releases the
file descriptor.
However, at this point the user context might be still held, and memory
buffers might be still in use.
Later on, calls to release those buffers will fail due to not finding
their handles in the IDR, leading to a memory leak.
To avoid this leak, split the IDR destruction from the memory manager
fini, and postpone it to hpriv_release() when there is no user context
and no buffers are used.

Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
Signed-off-by: Sasha Levin
---
 drivers/accel/habanalabs/common/device.c         |  9 +++++++++
 drivers/accel/habanalabs/common/habanalabs.h     |  1 +
 drivers/accel/habanalabs/common/habanalabs_drv.c |  1 +
 drivers/accel/habanalabs/common/memory_mgr.c     | 13 ++++++++++++-
 4 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c
index 9933e5858a363..c91436609f080 100644
--- a/drivers/accel/habanalabs/common/device.c
+++ b/drivers/accel/habanalabs/common/device.c
@@ -423,6 +423,9 @@ static void hpriv_release(struct kref *ref)
 	mutex_destroy(&hpriv->ctx_lock);
 	mutex_destroy(&hpriv->restore_phase_mutex);
 
+	/* There should be no memory buffers at this point and handles IDR can be destroyed */
+	hl_mem_mgr_idr_destroy(&hpriv->mem_mgr);
+
 	/* Device should be reset if reset-upon-device-release is enabled, or if there is a pending
 	 * reset that waits for device release.
 	 */
@@ -514,6 +517,10 @@ static int hl_device_release(struct inode *inode, struct file *filp)
 	}
 
 	hl_ctx_mgr_fini(hdev, &hpriv->ctx_mgr);
+
+	/* Memory buffers might be still in use at this point and thus the handles IDR destruction
+	 * is postponed to hpriv_release().
+	 */
 	hl_mem_mgr_fini(&hpriv->mem_mgr);
 
 	hdev->compute_ctx_in_release = 1;
@@ -887,6 +894,7 @@ static int device_early_init(struct hl_device *hdev)
 
 free_cb_mgr:
 	hl_mem_mgr_fini(&hdev->kernel_mem_mgr);
+	hl_mem_mgr_idr_destroy(&hdev->kernel_mem_mgr);
 free_chip_info:
 	kfree(hdev->hl_chip_info);
 free_prefetch_wq:
@@ -930,6 +938,7 @@ static void device_early_fini(struct hl_device *hdev)
 	mutex_destroy(&hdev->clk_throttling.lock);
 
 	hl_mem_mgr_fini(&hdev->kernel_mem_mgr);
+	hl_mem_mgr_idr_destroy(&hdev->kernel_mem_mgr);
 
 	kfree(hdev->hl_chip_info);
 
diff --git a/drivers/accel/habanalabs/common/habanalabs.h b/drivers/accel/habanalabs/common/habanalabs.h
index fa05e76d3d21a..829b30ab1961a 100644
--- a/drivers/accel/habanalabs/common/habanalabs.h
+++ b/drivers/accel/habanalabs/common/habanalabs.h
@@ -3861,6 +3861,7 @@ const char *hl_sync_engine_to_string(enum hl_sync_engine_type engine_type);
 
 void hl_mem_mgr_init(struct device *dev, struct hl_mem_mgr *mmg);
 void hl_mem_mgr_fini(struct hl_mem_mgr *mmg);
+void hl_mem_mgr_idr_destroy(struct hl_mem_mgr *mmg);
 int hl_mem_mgr_mmap(struct hl_mem_mgr *mmg, struct vm_area_struct *vma,
 		    void *args);
 struct hl_mmap_mem_buf *hl_mmap_mem_buf_get(struct hl_mem_mgr *mmg,
diff --git a/drivers/accel/habanalabs/common/habanalabs_drv.c b/drivers/accel/habanalabs/common/habanalabs_drv.c
index 03dae57dc8386..e3781cfe8a7fe 100644
--- a/drivers/accel/habanalabs/common/habanalabs_drv.c
+++ b/drivers/accel/habanalabs/common/habanalabs_drv.c
@@ -237,6 +237,7 @@ int hl_device_open(struct inode *inode, struct file *filp)
 out_err:
 	mutex_unlock(&hdev->fpriv_list_lock);
 	hl_mem_mgr_fini(&hpriv->mem_mgr);
+	hl_mem_mgr_idr_destroy(&hpriv->mem_mgr);
 	hl_ctx_mgr_fini(hpriv->hdev, &hpriv->ctx_mgr);
 	filp->private_data = NULL;
 	mutex_destroy(&hpriv->ctx_lock);
diff --git a/drivers/accel/habanalabs/common/memory_mgr.c b/drivers/accel/habanalabs/common/memory_mgr.c
index 0f2759e265477..f8e8261cc83d8 100644
--- a/drivers/accel/habanalabs/common/memory_mgr.c
+++ b/drivers/accel/habanalabs/common/memory_mgr.c
@@ -341,8 +341,19 @@ void hl_mem_mgr_fini(struct hl_mem_mgr *mmg)
 				"%s: Buff handle %u for CTX is still alive\n",
 				topic, id);
 	}
+}
 
-	/* TODO: can it happen that some buffer is still in use at this point? */
+/**
+ * hl_mem_mgr_idr_destroy() - destroy memory manager IDR.
+ * @mmg: parent unified memory manager
+ *
+ * Destroy the memory manager IDR.
+ * Shall be called when IDR is empty and no memory buffers are in use.
+ */
+void hl_mem_mgr_idr_destroy(struct hl_mem_mgr *mmg)
+{
+	if (!idr_is_empty(&mmg->handles))
+		dev_crit(mmg->dev, "memory manager IDR is destroyed while it is not empty!\n");
 
 	idr_destroy(&mmg->handles);
 }
-- 
2.39.2
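
For readers who want to see the ordering problem in isolation, here is a
minimal userspace sketch of it. It is not the driver code: the fixed-size
table stands in for the handles IDR, a plain integer stands in for the kref,
and every name in it (file_priv, handle_table, buf_alloc, buf_release,
priv_get, priv_put, file_release_early_destroy, file_release_deferred) is
hypothetical and chosen only for illustration. It assumes a single thread
and no locking.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_HANDLES 4

/* Hypothetical stand-in for the handles IDR. */
struct handle_table {
	bool used[MAX_HANDLES];
	bool destroyed;
};

/* Hypothetical stand-in for hpriv: per-open state with a reference count. */
struct file_priv {
	int refcount;
	struct handle_table handles;
};

/* The opener holds the first reference. */
static void priv_init(struct file_priv *p)
{
	*p = (struct file_priv){ .refcount = 1 };
}

/* Forget every handle; buffers that were still registered become unreachable. */
static void table_destroy(struct handle_table *t)
{
	for (int i = 0; i < MAX_HANDLES; i++)
		t->used[i] = false;
	t->destroyed = true;
}

/* Returns a handle, or -1 if the table is full or already destroyed. */
static int buf_alloc(struct file_priv *p)
{
	if (p->handles.destroyed)
		return -1;
	for (int i = 0; i < MAX_HANDLES; i++) {
		if (!p->handles.used[i]) {
			p->handles.used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Returns 0 on success, -1 if the handle can no longer be found (the leak path). */
static int buf_release(struct file_priv *p, int handle)
{
	if (handle < 0 || handle >= MAX_HANDLES || !p->handles.used[handle])
		return -1;
	p->handles.used[handle] = false;
	return 0;
}

static void priv_get(struct file_priv *p)
{
	p->refcount++;
}

/* The last put destroys the table, mirroring what the patch does in hpriv_release(). */
static void priv_put(struct file_priv *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		table_destroy(&p->handles);
}

/* Pre-patch ordering: the table is torn down when the file is released. */
static void file_release_early_destroy(struct file_priv *p)
{
	table_destroy(&p->handles);
	priv_put(p);
}

/* Post-patch ordering: file release only drops its own reference. */
static void file_release_deferred(struct file_priv *p)
{
	priv_put(p);
}
```

With the early-destroy ordering, a buffer allocated before the release can no
longer be released by a context that still holds a reference, because its
handle is gone from the table. With the deferred ordering, the same release
succeeds and the table is destroyed only when the last reference is dropped.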