From: Dave Airlie
To: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org, dakr@kernel.org
Subject: [PATCH 2/2] nouveau: Membar before between semaphore writes and the interrupt
Date: Fri, 29 Aug 2025 12:16:33 +1000
Message-ID: <20250829021633.1674524-2-airlied@gmail.com>
In-Reply-To: <20250829021633.1674524-1-airlied@gmail.com>
References: <20250829021633.1674524-1-airlied@gmail.com>

From: Faith Ekstrand

This ensures that the memory write and the interrupt are properly
ordered and we won't wake up the kernel before the semaphore write has
hit memory.

Fixes: b1ca384772b6 ("drm/nouveau/gv100-: switch to volta semaphore methods")
Cc: stable@vger.kernel.org
Signed-off-by: Faith Ekstrand
Signed-off-by: Dave Airlie
---
 drivers/gpu/drm/nouveau/gv100_fence.c         |  7 +-
 .../drm/nouveau/include/nvhw/class/clc36f.h   | 85 +++++++++++++++++++
 2 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/gv100_fence.c b/drivers/gpu/drm/nouveau/gv100_fence.c
index cccdeca72002..317e516c4ec7 100644
--- a/drivers/gpu/drm/nouveau/gv100_fence.c
+++ b/drivers/gpu/drm/nouveau/gv100_fence.c
@@ -18,7 +18,7 @@ gv100_fence_emit32(struct nouveau_channel *chan, u64 virtual, u32 sequence)
 	struct nvif_push *push = &chan->chan.push;
 	int ret;
 
-	ret = PUSH_WAIT(push, 8);
+	ret = PUSH_WAIT(push, 13);
 	if (ret)
 		return ret;
 
@@ -32,6 +32,11 @@ gv100_fence_emit32(struct nouveau_channel *chan, u64 virtual, u32 sequence)
 		  NVDEF(NVC36F, SEM_EXECUTE, PAYLOAD_SIZE, 32BIT) |
 		  NVDEF(NVC36F, SEM_EXECUTE, RELEASE_TIMESTAMP, DIS));
 
+	PUSH_MTHD(push, NVC36F, MEM_OP_A, 0,
+				MEM_OP_B, 0,
+				MEM_OP_C, NVDEF(NVC36F, MEM_OP_C, MEMBAR_TYPE, SYS_MEMBAR),
+				MEM_OP_D, NVDEF(NVC36F, MEM_OP_D, OPERATION, MEMBAR));
+
 	PUSH_MTHD(push, NVC36F, NON_STALL_INTERRUPT, 0);
 
 	PUSH_KICK(push);
diff --git a/drivers/gpu/drm/nouveau/include/nvhw/class/clc36f.h b/drivers/gpu/drm/nouveau/include/nvhw/class/clc36f.h
index 8735dda4c8a7..338f74b9f501 100644
--- a/drivers/gpu/drm/nouveau/include/nvhw/class/clc36f.h
+++ b/drivers/gpu/drm/nouveau/include/nvhw/class/clc36f.h
@@ -7,6 +7,91 @@
 
 #define NVC36F_NON_STALL_INTERRUPT (0x00000020)
 #define NVC36F_NON_STALL_INTERRUPT_HANDLE 31:0
+// NOTE - MEM_OP_A and MEM_OP_B have been replaced in gp100 with methods for
+// specifying the page address for a targeted TLB invalidate and the uTLB for
+// a targeted REPLAY_CANCEL for UVM.
+// The previous MEM_OP_A/B functionality is in MEM_OP_C/D, with slightly
+// rearranged fields.
+#define NVC36F_MEM_OP_A (0x00000028)
+#define NVC36F_MEM_OP_A_TLB_INVALIDATE_CANCEL_TARGET_CLIENT_UNIT_ID 5:0 // only relevant for REPLAY_CANCEL_TARGETED
+#define NVC36F_MEM_OP_A_TLB_INVALIDATE_INVALIDATION_SIZE 5:0 // Used to specify size of invalidate, used for invalidates which are not of the REPLAY_CANCEL_TARGETED type
+#define NVC36F_MEM_OP_A_TLB_INVALIDATE_CANCEL_TARGET_GPC_ID 10:6 // only relevant for REPLAY_CANCEL_TARGETED
+#define NVC36F_MEM_OP_A_TLB_INVALIDATE_CANCEL_MMU_ENGINE_ID 6:0 // only relevant for REPLAY_CANCEL_VA_GLOBAL
+#define NVC36F_MEM_OP_A_TLB_INVALIDATE_SYSMEMBAR 11:11
+#define NVC36F_MEM_OP_A_TLB_INVALIDATE_SYSMEMBAR_EN 0x00000001
+#define NVC36F_MEM_OP_A_TLB_INVALIDATE_SYSMEMBAR_DIS 0x00000000
+#define NVC36F_MEM_OP_A_TLB_INVALIDATE_TARGET_ADDR_LO 31:12
+#define NVC36F_MEM_OP_B (0x0000002c)
+#define NVC36F_MEM_OP_B_TLB_INVALIDATE_TARGET_ADDR_HI 31:0
+#define NVC36F_MEM_OP_C (0x00000030)
+#define NVC36F_MEM_OP_C_MEMBAR_TYPE 2:0
+#define NVC36F_MEM_OP_C_MEMBAR_TYPE_SYS_MEMBAR 0x00000000
+#define NVC36F_MEM_OP_C_MEMBAR_TYPE_MEMBAR 0x00000001
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PDB 0:0
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PDB_ONE 0x00000000
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PDB_ALL 0x00000001 // Probably nonsensical for MMU_TLB_INVALIDATE_TARGETED
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_GPC 1:1
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_GPC_ENABLE 0x00000000
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_GPC_DISABLE 0x00000001
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_REPLAY 4:2 // only relevant if GPC ENABLE
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_REPLAY_NONE 0x00000000
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_REPLAY_START 0x00000001
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_REPLAY_START_ACK_ALL 0x00000002
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_REPLAY_CANCEL_TARGETED 0x00000003
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_REPLAY_CANCEL_GLOBAL 0x00000004
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_REPLAY_CANCEL_VA_GLOBAL 0x00000005
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE 6:5 // only relevant if GPC ENABLE
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE_NONE 0x00000000
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE_GLOBALLY 0x00000001
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE_INTRANODE 0x00000002
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE 9:7 //only relevant for REPLAY_CANCEL_VA_GLOBAL
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_READ 0
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_WRITE 1
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_ATOMIC_STRONG 2
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_RSVRVD 3
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_ATOMIC_WEAK 4
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_ATOMIC_ALL 5
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_WRITE_AND_ATOMIC 6
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_ALL 7
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL 9:7 // Invalidate affects this level and all below
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_ALL 0x00000000 // Invalidate tlb caches at all levels of the page table
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_PTE_ONLY 0x00000001
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE0 0x00000002
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE1 0x00000003
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE2 0x00000004
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE3 0x00000005
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE4 0x00000006
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE5 0x00000007
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE 11:10 // only relevant if PDB_ONE
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE_VID_MEM 0x00000000
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE_SYS_MEM_COHERENT 0x00000002
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE_SYS_MEM_NONCOHERENT 0x00000003
+#define NVC36F_MEM_OP_C_TLB_INVALIDATE_PDB_ADDR_LO 31:12 // only relevant if PDB_ONE
+#define NVC36F_MEM_OP_C_ACCESS_COUNTER_CLR_TARGETED_NOTIFY_TAG 19:0
+// MEM_OP_D MUST be preceded by MEM_OPs A-C.
+#define NVC36F_MEM_OP_D (0x00000034)
+#define NVC36F_MEM_OP_D_TLB_INVALIDATE_PDB_ADDR_HI 26:0 // only relevant if PDB_ONE
+#define NVC36F_MEM_OP_D_OPERATION 31:27
+#define NVC36F_MEM_OP_D_OPERATION_MEMBAR 0x00000005
+#define NVC36F_MEM_OP_D_OPERATION_MMU_TLB_INVALIDATE 0x00000009
+#define NVC36F_MEM_OP_D_OPERATION_MMU_TLB_INVALIDATE_TARGETED 0x0000000a
+#define NVC36F_MEM_OP_D_OPERATION_L2_PEERMEM_INVALIDATE 0x0000000d
+#define NVC36F_MEM_OP_D_OPERATION_L2_SYSMEM_INVALIDATE 0x0000000e
+// CLEAN_LINES is an alias for Tegra/GPU IP usage
+#define NVC36F_MEM_OP_B_OPERATION_L2_INVALIDATE_CLEAN_LINES 0x0000000e
+#define NVC36F_MEM_OP_D_OPERATION_L2_CLEAN_COMPTAGS 0x0000000f
+#define NVC36F_MEM_OP_D_OPERATION_L2_FLUSH_DIRTY 0x00000010
+#define NVC36F_MEM_OP_D_OPERATION_L2_WAIT_FOR_SYS_PENDING_READS 0x00000015
+#define NVC36F_MEM_OP_D_OPERATION_ACCESS_COUNTER_CLR 0x00000016
+#define NVC36F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE 1:0
+#define NVC36F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE_MIMC 0x00000000
+#define NVC36F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE_MOMC 0x00000001
+#define NVC36F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE_ALL 0x00000002
+#define NVC36F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE_TARGETED 0x00000003
+#define NVC36F_MEM_OP_D_ACCESS_COUNTER_CLR_TARGETED_TYPE 2:2
+#define NVC36F_MEM_OP_D_ACCESS_COUNTER_CLR_TARGETED_TYPE_MIMC 0x00000000
+#define NVC36F_MEM_OP_D_ACCESS_COUNTER_CLR_TARGETED_TYPE_MOMC 0x00000001
+#define NVC36F_MEM_OP_D_ACCESS_COUNTER_CLR_TARGETED_BANK 6:3
 #define NVC36F_SEM_ADDR_LO (0x0000005c)
 #define NVC36F_SEM_ADDR_LO_OFFSET 31:2
 #define NVC36F_SEM_ADDR_HI (0x00000060)
-- 
2.50.1
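
Reviewer note, not part of the patch: a rough cross-check of the values the new MEM_OP burst is expected to carry, using only the field positions and constants from the clc36f.h hunk. The helper names field_prep, mem_op_c_sys_membar, mem_op_d_membar and burst_dwords are hypothetical; the driver itself uses the NVDEF and PUSH_MTHD macros. The sketch also assumes that a burst of consecutive methods costs one header dword plus one data dword per method, which would be consistent with PUSH_WAIT growing from 8 to 13.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the clc36f.h hunk in this patch. */
#define MEM_OP_C_MEMBAR_TYPE_SYS_MEMBAR 0x00000000u /* field 2:0   */
#define MEM_OP_D_OPERATION_MEMBAR       0x00000005u /* field 31:27 */

/*
 * Hypothetical helper: place `value` into the inclusive bit range hi:lo
 * of a 32-bit word. This only mimics what a field macro would compute;
 * it is not the driver's NVDEF implementation.
 */
uint32_t field_prep(unsigned hi, unsigned lo, uint32_t value)
{
	uint32_t width, mask;

	assert(hi < 32 && lo <= hi);
	width = hi - lo + 1;
	/* Avoid an undefined 32-bit shift when the field spans the word. */
	mask = (width == 32) ? 0xffffffffu : ((1u << width) - 1u);
	assert((value & ~mask) == 0); /* value must fit in the field */
	return value << lo;
}

/* Data dword for MEM_OP_C: MEMBAR_TYPE = SYS_MEMBAR. */
uint32_t mem_op_c_sys_membar(void)
{
	return field_prep(2, 0, MEM_OP_C_MEMBAR_TYPE_SYS_MEMBAR);
}

/* Data dword for MEM_OP_D: OPERATION = MEMBAR. */
uint32_t mem_op_d_membar(void)
{
	return field_prep(31, 27, MEM_OP_D_OPERATION_MEMBAR);
}

/*
 * Assumption: one header dword per burst of consecutive methods,
 * plus one data dword per method.
 */
unsigned burst_dwords(unsigned n_methods)
{
	return 1 + n_methods;
}
```

Under these assumptions, MEM_OP_A and MEM_OP_B carry 0, MEM_OP_C carries 0x00000000, and MEM_OP_D carries 5 << 27 = 0x28000000. The four-method burst takes burst_dwords(4) = 5 dwords, which accounts for the difference between the old and new PUSH_WAIT sizes.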