From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93E34C6FA8E for ; Mon, 27 Feb 2023 02:02:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229863AbjB0CCV (ORCPT ); Sun, 26 Feb 2023 21:02:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52818 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229688AbjB0CCA (ORCPT ); Sun, 26 Feb 2023 21:02:00 -0500 Received: from sin.source.kernel.org (sin.source.kernel.org [IPv6:2604:1380:40e1:4800::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 992E013DC4; Sun, 26 Feb 2023 18:01:43 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id AFD58CE0F25; Mon, 27 Feb 2023 02:01:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D6238C4339C; Mon, 27 Feb 2023 02:01:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1677463300; bh=Ikco7Thhh8K6W/4hrM2WIUukC4nBOz97oM872JFtFSU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ThIiGMwM54319feeuCx0iE9MBfCKZi5UXcK3IsYr47y7OYqazR4XBgweyjOdxPF0Y H5Fc+0XLaaBuyOh4jId6GAqd0/NnZWjLOoJOXNrJ7Lv4MJLwLcyNDtZYP9IpblyvFi V9KfqLc5SU3Z3ViiQuOb801Mxujfkkc5p6bZt1x13dVvNw32IuHdpNcgvoiFW0zO1o hUYcypKpHwKwbNGf6m1Zzw7KoILOHaF4p26WT5KM8FA8LY3ZI08a75QUGq//G39jja pSJH9pwZ7UQ3WW/B+f5UjhztSWEJ+sL9xHHCYwK1kw334Rkl7HMPMAfusy4KsNb1LU 5iAFGN9G2qqLg== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Philip Yang , Felix Kuehling , Alex Deucher , Sasha Levin , christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@gmail.com, daniel@ffwll.ch, amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Subject: [PATCH AUTOSEL 6.2 12/60] drm/amdkfd: Page aligned memory reserve size Date: Sun, 26 Feb 2023 20:59:57 -0500 Message-Id: <20230227020045.1045105-12-sashal@kernel.org> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230227020045.1045105-1-sashal@kernel.org> References: <20230227020045.1045105-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Philip Yang [ Upstream commit 0c2dece8fb541ab07b68c3312a1065fa9c927a81 ] Use page aligned size to reserve memory usage because page aligned TTM BO size is used to unreserve memory usage, otherwise no page aligned size causes memory usage accounting unbalanced. Change vram_used definition type to int64_t to be able to trigger WARN_ONCE(adev && adev->kfd.vram_used < 0, "..."), to help debug the accounting issue with warning and backtrace. Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 +++++++----- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 9 +++++++-- 3 files changed, 15 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index 0040deaf8a83a..90a5254ec1387 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -97,7 +97,7 @@ struct amdgpu_amdkfd_fence { struct amdgpu_kfd_dev { struct kfd_dev *dev; - uint64_t vram_used; + int64_t vram_used; uint64_t vram_used_aligned; bool init_complete; struct work_struct reset_work; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index 3b5c53712d319..05b884fe0a927 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -1612,6 +1612,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( struct amdgpu_bo *bo; struct drm_gem_object *gobj = NULL; u32 domain, alloc_domain; + uint64_t aligned_size; u64 alloc_flags; int ret; @@ -1667,22 +1668,23 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( * the memory. */ if ((*mem)->aql_queue) - size = size >> 1; + size >>= 1; + aligned_size = PAGE_ALIGN(size); (*mem)->alloc_flags = flags; amdgpu_sync_create(&(*mem)->sync); - ret = amdgpu_amdkfd_reserve_mem_limit(adev, size, flags); + ret = amdgpu_amdkfd_reserve_mem_limit(adev, aligned_size, flags); if (ret) { pr_debug("Insufficient memory\n"); goto err_reserve_limit; } pr_debug("\tcreate BO VA 0x%llx size 0x%llx domain %s\n", - va, size, domain_string(alloc_domain)); + va, (*mem)->aql_queue ? size << 1 : size, domain_string(alloc_domain)); - ret = amdgpu_gem_object_create(adev, size, 1, alloc_domain, alloc_flags, + ret = amdgpu_gem_object_create(adev, aligned_size, 1, alloc_domain, alloc_flags, bo_type, NULL, &gobj); if (ret) { pr_debug("Failed to create BO on domain %s. ret %d\n", @@ -1739,7 +1741,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( /* Don't unreserve system mem limit twice */ goto err_reserve_limit; err_bo_create: - amdgpu_amdkfd_unreserve_mem_limit(adev, size, flags); + amdgpu_amdkfd_unreserve_mem_limit(adev, aligned_size, flags); err_reserve_limit: mutex_destroy(&(*mem)->lock); if (gobj) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 6d291aa6386bd..f79b8e964140e 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1127,8 +1127,13 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep, } /* Update the VRAM usage count */ - if (flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) - WRITE_ONCE(pdd->vram_usage, pdd->vram_usage + args->size); + if (flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) { + uint64_t size = args->size; + + if (flags & KFD_IOC_ALLOC_MEM_FLAGS_AQL_QUEUE_MEM) + size >>= 1; + WRITE_ONCE(pdd->vram_usage, pdd->vram_usage + PAGE_ALIGN(size)); + } mutex_unlock(&p->mutex); -- 2.39.0