AMD-GFX Archive on lore.kernel.org
* [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support
@ 2026-01-16 20:39 Jay Cornwall
  2026-01-16 20:39 ` [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source Jay Cornwall
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC
  To: amd-gfx; +Cc: Jay Cornwall

Fix a broken merge and upstream the missing gfx12.1 changes.

Jay Cornwall (4):
  drm/amdkfd: Sync trap handler binary with source
  drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
  drm/amdkfd: gfx12.1 cluster barrier context save workaround
  drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode

Lancelot Six (1):
  drm/amdkfd: Do not include VGPR MSBs in saved PC during save

 .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h    | 1435 ++++++++---------
 .../amd/amdkfd/cwsr_trap_handler_gfx12.asm    |   73 +-
 2 files changed, 744 insertions(+), 764 deletions(-)

-- 
2.34.1



* [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source
  2026-01-16 20:39 [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support Jay Cornwall
@ 2026-01-16 20:39 ` Jay Cornwall
  2026-01-20 22:34   ` Lancelot SIX
  2026-01-16 20:39 ` [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler Jay Cornwall
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC
  To: amd-gfx; +Cc: Jay Cornwall, Lancelot Six, Vladimir Indic

The trap handler binary and source desynced during branch activity.
The source merge also introduced a compile error.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h    | 1742 ++++++++---------
 .../amd/amdkfd/cwsr_trap_handler_gfx12.asm    |    1 +
 2 files changed, 836 insertions(+), 907 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index dfffda4aa8e2..6281b2f9faee 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -3644,7 +3644,7 @@ static const uint32_t cwsr_trap_gfx9_4_3_hex[] = {
 };
 
 static const uint32_t cwsr_trap_gfx12_hex[] = {
-	0xbfa00001, 0xbfa002b2,
+	0xbfa00001, 0xbfa00239,
 	0xb0804009, 0xb8eef81a,
 	0xbf880000, 0xb980081a,
 	0x00000000, 0xb8f8f804,
@@ -3711,464 +3711,385 @@ static const uint32_t cwsr_trap_gfx12_hex[] = {
 	0x807a817a, 0xbf0d997b,
 	0xbfa20002, 0x847a897a,
 	0xbfa00001, 0x847a8a7a,
-	0xb8fb1e06, 0x847b8a7b,
-	0x807a7b7a, 0x8b7bff7f,
-	0x0000ffff, 0x807aff7a,
-	0x00000200, 0x807a7e7a,
-	0x827b807b, 0xd7610000,
-	0x00010870, 0xd7610000,
-	0x00010a71, 0xd7610000,
-	0x00010c72, 0xd7610000,
-	0x00010e73, 0xd7610000,
-	0x00011074, 0xd7610000,
-	0x00011275, 0xd7610000,
-	0x00011476, 0xd7610000,
-	0x00011677, 0xd7610000,
-	0x00011a79, 0xd7610000,
-	0x00011c7e, 0xd7610000,
-	0x00011e7f, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xbefe00ff,
-	0x00003fff, 0xbeff0080,
-	0xee0a407a, 0x000c0000,
-	0x00004000, 0xd760007a,
-	0x00011d00, 0xd760007b,
-	0x00011f00, 0xbefe007a,
-	0xbeff007b, 0xbef4007e,
-	0x8b75ff7f, 0x0000ffff,
-	0x8c75ff75, 0x00040000,
-	0xbef60080, 0xbef700ff,
-	0x10807fac, 0xbef1007d,
-	0xbef00080, 0xb8f30742,
-	0x84739973, 0xbefe00c1,
-	0x857d9973, 0x8b7d817d,
-	0xbf06817d, 0xbfa20002,
-	0xbeff0080, 0xbfa00002,
-	0xbeff00c1, 0xbfa0000c,
-	0xbef600ff, 0x01000000,
-	0xc4068070, 0x008ce801,
-	0x00008000, 0xc4068070,
-	0x008ce802, 0x00010000,
-	0xc4068070, 0x008ce803,
-	0x00018000, 0xbfa0000b,
-	0xbef600ff, 0x01000000,
-	0xc4068070, 0x008ce801,
-	0x00010000, 0xc4068070,
-	0x008ce802, 0x00020000,
-	0xc4068070, 0x008ce803,
-	0x00030000, 0xb8f03b05,
-	0x80708170, 0xbf0d9973,
-	0xbfa20002, 0x84708970,
-	0xbfa00001, 0x84708a70,
-	0xb8fa1e06, 0x847a8a7a,
-	0x80707a70, 0x8070ff70,
-	0x00000200, 0xbef600ff,
-	0x01000000, 0x7e000280,
-	0x7e020280, 0x7e040280,
-	0xbe804ec2, 0xbf94fffe,
-	0xb8faf804, 0x8b7a847a,
-	0x91788478, 0x8c787a78,
-	0x917aff6d, 0x80000000,
-	0xd7610002, 0x00010071,
-	0xd7610002, 0x0001026c,
-	0xd7610002, 0x0001047a,
-	0xd7610002, 0x0001066e,
-	0xd7610002, 0x0001086f,
-	0xd7610002, 0x00010a78,
-	0xd7610002, 0x00010e7b,
-	0xd8500000, 0x00000000,
-	0xd8500000, 0x00000000,
-	0xd8500000, 0x00000000,
-	0xd8500000, 0x00000000,
-	0xd8500000, 0x00000000,
-	0xd8500000, 0x00000000,
-	0xd8500000, 0x00000000,
-	0xd8500000, 0x00000000,
-	0xb8faf811, 0xd7610002,
-	0x00010c7a, 0xb8faf801,
-	0xd7610002, 0x0001107a,
-	0xb8faf814, 0xd7610002,
-	0x0001127a, 0xb8faf815,
-	0xd7610002, 0x0001147a,
-	0xb8faf812, 0xd7610002,
-	0x0001167a, 0xb8faf813,
-	0xd7610002, 0x0001187a,
-	0xb8faf802, 0xd7610002,
-	0x00011a7a, 0xbefa50c1,
-	0xbfc70000, 0xd7610002,
-	0x00011c7a, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xbefe00ff,
-	0x0000ffff, 0xbeff0080,
-	0xc4068070, 0x008ce802,
-	0x00000000, 0xbefe00c1,
+	0x8b7bff7f, 0x0000ffff,
+	0x807aff7a, 0x00000240,
+	0x807a7e7a, 0x827b807b,
+	0xd7610000, 0x00010870,
+	0xd7610000, 0x00010a71,
+	0xd7610000, 0x00010c72,
+	0xd7610000, 0x00010e73,
+	0xd7610000, 0x00011074,
+	0xd7610000, 0x00011275,
+	0xd7610000, 0x00011476,
+	0xd7610000, 0x00011677,
+	0xd7610000, 0x00011a79,
+	0xd7610000, 0x00011c7e,
+	0xd7610000, 0x00011e7f,
+	0xbefe00ff, 0x00003fff,
+	0xbeff0080, 0xee0a407a,
+	0x000c0000, 0x00000000,
+	0xd760007a, 0x00011d00,
+	0xd760007b, 0x00011f00,
+	0xbefe007a, 0xbeff007b,
+	0xbef4007e, 0x8b75ff7f,
+	0x0000ffff, 0xbef1007d,
+	0xb8f30742, 0x84739973,
+	0xbefe00c1, 0x857d9973,
+	0x8b7d817d, 0xbf06817d,
+	0xbfa20002, 0xbeff0080,
+	0xbfa00002, 0xbeff00c1,
+	0xbfa0000a, 0xee0a4074,
+	0x008c0000, 0x00008000,
+	0xee0a4074, 0x010c0000,
+	0x00010000, 0xee0a4074,
+	0x018c0000, 0x00018000,
+	0xbfa00009, 0xee0a4074,
+	0x008c0000, 0x00010000,
+	0xee0a4074, 0x010c0000,
+	0x00020000, 0xee0a4074,
+	0x018c0000, 0x00030000,
 	0xb8f03b05, 0x80708170,
 	0xbf0d9973, 0xbfa20002,
 	0x84708970, 0xbfa00001,
-	0x84708a70, 0xb8fa1e06,
-	0x847a8a7a, 0x80707a70,
-	0xbef600ff, 0x01000000,
+	0x84708a70, 0x8070ff70,
+	0x00000200, 0x7e000280,
+	0x7e020280, 0x7e040280,
+	0xbefd0080, 0xbe804ec2,
+	0xbf94fffe, 0xb8faf804,
+	0x8b7a847a, 0x91788478,
+	0x8c787a78, 0xd7610002,
+	0x0000fa71, 0x807d817d,
+	0xd7610002, 0x0000fa6c,
+	0x807d817d, 0x917aff6d,
+	0x80000000, 0xd7610002,
+	0x0000fa7a, 0x807d817d,
+	0xd7610002, 0x0000fa6e,
+	0x807d817d, 0xd7610002,
+	0x0000fa6f, 0x807d817d,
+	0xd7610002, 0x0000fa78,
+	0x807d817d, 0xb8faf811,
+	0xd7610002, 0x0000fa7a,
+	0x807d817d, 0xbefa0080,
+	0xd7610002, 0x0000fa7a,
+	0x807d817d, 0xb8f1f801,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8f1f814,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8f1f815,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8f1f812,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8f1f813,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8faf802,
+	0xd7610002, 0x0000fa7a,
+	0x807d817d, 0xbefa50c1,
+	0xbfc70000, 0xd7610002,
+	0x0000fa7a, 0x807d817d,
+	0xbefe00ff, 0x0000ffff,
+	0xbeff0080, 0x80767074,
+	0x82778075, 0xee0a4076,
+	0x010c0000, 0x00000000,
+	0xbefe00c1, 0xb8f03b05,
+	0x80708170, 0xbf0d9973,
+	0xbfa20002, 0x84708970,
+	0xbfa00001, 0x84708a70,
 	0xbef90080, 0xbefd0080,
 	0xbf800000, 0xbe804100,
 	0xbe824102, 0xbe844104,
 	0xbe864106, 0xbe884108,
 	0xbe8a410a, 0xbe8c410c,
-	0xbe8e410e, 0xbf068079,
-	0xbfa10032, 0xd7610002,
-	0x00010000, 0xd7610002,
-	0x00010201, 0xd7610002,
-	0x00010402, 0xd7610002,
-	0x00010603, 0xd7610002,
-	0x00010804, 0xd7610002,
-	0x00010a05, 0xd7610002,
-	0x00010c06, 0xd7610002,
-	0x00010e07, 0xd7610002,
-	0x00011008, 0xd7610002,
-	0x00011209, 0xd7610002,
-	0x0001140a, 0xd7610002,
-	0x0001160b, 0xd7610002,
-	0x0001180c, 0xd7610002,
-	0x00011a0d, 0xd7610002,
-	0x00011c0e, 0xd7610002,
-	0x00011e0f, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0x80799079,
-	0xbfa00038, 0xd7610002,
-	0x00012000, 0xd7610002,
-	0x00012201, 0xd7610002,
-	0x00012402, 0xd7610002,
-	0x00012603, 0xd7610002,
-	0x00012804, 0xd7610002,
-	0x00012a05, 0xd7610002,
-	0x00012c06, 0xd7610002,
-	0x00012e07, 0xd7610002,
-	0x00013008, 0xd7610002,
-	0x00013209, 0xd7610002,
-	0x0001340a, 0xd7610002,
-	0x0001360b, 0xd7610002,
-	0x0001380c, 0xd7610002,
-	0x00013a0d, 0xd7610002,
-	0x00013c0e, 0xd7610002,
-	0x00013e0f, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0x80799079,
-	0xc4068070, 0x008ce802,
-	0x00000000, 0x8070ff70,
-	0x00000080, 0xbef90080,
-	0x7e040280, 0x807d907d,
-	0xbf0aff7d, 0x00000060,
-	0xbfa2ff88, 0xbe804100,
-	0xbe824102, 0xbe844104,
-	0xbe864106, 0xbe884108,
-	0xbe8a410a, 0xd7610002,
-	0x00010000, 0xd7610002,
-	0x00010201, 0xd7610002,
-	0x00010402, 0xd7610002,
-	0x00010603, 0xd7610002,
-	0x00010804, 0xd7610002,
-	0x00010a05, 0xd7610002,
-	0x00010c06, 0xd7610002,
-	0x00010e07, 0xd7610002,
-	0x00011008, 0xd7610002,
-	0x00011209, 0xd7610002,
-	0x0001140a, 0xd7610002,
-	0x0001160b, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xd8500000,
-	0x00000000, 0xc4068070,
-	0x008ce802, 0x00000000,
-	0xbefe00c1, 0x857d9973,
-	0x8b7d817d, 0xbf06817d,
-	0xbfa20002, 0xbeff0080,
-	0xbfa00001, 0xbeff00c1,
-	0xb8fb4306, 0x8b7bc17b,
-	0xbfa10044, 0x8b7aff6d,
-	0x80000000, 0xbfa10041,
-	0x847b897b, 0xbef6007b,
+	0xbe8e410e, 0xd7610002,
+	0x0000f200, 0x80798179,
+	0xd7610002, 0x0000f201,
+	0x80798179, 0xd7610002,
+	0x0000f202, 0x80798179,
+	0xd7610002, 0x0000f203,
+	0x80798179, 0xd7610002,
+	0x0000f204, 0x80798179,
+	0xd7610002, 0x0000f205,
+	0x80798179, 0xd7610002,
+	0x0000f206, 0x80798179,
+	0xd7610002, 0x0000f207,
+	0x80798179, 0xd7610002,
+	0x0000f208, 0x80798179,
+	0xd7610002, 0x0000f209,
+	0x80798179, 0xd7610002,
+	0x0000f20a, 0x80798179,
+	0xd7610002, 0x0000f20b,
+	0x80798179, 0xd7610002,
+	0x0000f20c, 0x80798179,
+	0xd7610002, 0x0000f20d,
+	0x80798179, 0xd7610002,
+	0x0000f20e, 0x80798179,
+	0xd7610002, 0x0000f20f,
+	0x80798179, 0xbf06a079,
+	0xbfa10009, 0x80767074,
+	0x82778075, 0xee0a4076,
+	0x010c0000, 0x00000000,
+	0x8070ff70, 0x00000080,
+	0xbef90080, 0x7e040280,
+	0x807d907d, 0xbf0aff7d,
+	0x00000060, 0xbfa2ffb9,
+	0xbe804100, 0xbe824102,
+	0xbe844104, 0xbe864106,
+	0xbe884108, 0xbe8a410a,
+	0xd7610002, 0x0000f200,
+	0x80798179, 0xd7610002,
+	0x0000f201, 0x80798179,
+	0xd7610002, 0x0000f202,
+	0x80798179, 0xd7610002,
+	0x0000f203, 0x80798179,
+	0xd7610002, 0x0000f204,
+	0x80798179, 0xd7610002,
+	0x0000f205, 0x80798179,
+	0xd7610002, 0x0000f206,
+	0x80798179, 0xd7610002,
+	0x0000f207, 0x80798179,
+	0xd7610002, 0x0000f208,
+	0x80798179, 0xd7610002,
+	0x0000f209, 0x80798179,
+	0xd7610002, 0x0000f20a,
+	0x80798179, 0xd7610002,
+	0x0000f20b, 0x80798179,
+	0x80767074, 0x82778075,
+	0xee0a4076, 0x010c0000,
+	0x00000000, 0xbefe00c1,
+	0x857d9973, 0x8b7d817d,
+	0xbf06817d, 0xbfa20002,
+	0xbeff0080, 0xbfa00001,
+	0xbeff00c1, 0xb8fb4306,
+	0x8b7bc17b, 0xbfa10042,
+	0x8b7aff6d, 0x80000000,
+	0xbfa1003f, 0x847b897b,
 	0xb8f03b05, 0x80708170,
 	0xbf0d9973, 0xbfa20002,
 	0x84708970, 0xbfa00001,
-	0x84708a70, 0xb8fa1e06,
-	0x847a8a7a, 0x80707a70,
-	0x8070ff70, 0x00000200,
-	0x8070ff70, 0x00000080,
-	0xbef600ff, 0x01000000,
-	0xd71f0000, 0x000100c1,
-	0xd7200000, 0x000200c1,
-	0x16000084, 0x857d9973,
-	0x8b7d817d, 0xbf06817d,
-	0xbefd0080, 0xbfa20013,
-	0xbe8300ff, 0x00000080,
+	0x84708a70, 0x8070ff70,
+	0x00000200, 0x8070ff70,
+	0x00000080, 0xd71f0000,
+	0x000100c1, 0xd7200000,
+	0x000200c1, 0x16000084,
+	0x857d9973, 0x8b7d817d,
+	0xbf06817d, 0xbefd0080,
+	0xbfa20015, 0xbe8300ff,
+	0x00000080, 0xbf800000,
+	0xbf800000, 0xbf800000,
+	0xd8d80000, 0x01000000,
+	0xbf8a0000, 0x80767074,
+	0x82778075, 0xee0a4076,
+	0x008c0000, 0x00000000,
+	0x807d037d, 0x80700370,
+	0xd5250000, 0x0001ff00,
+	0x00000080, 0xbf0a7b7d,
+	0xbfa2fff1, 0xbfa00014,
+	0xbe8300ff, 0x00000100,
 	0xbf800000, 0xbf800000,
 	0xbf800000, 0xd8d80000,
 	0x01000000, 0xbf8a0000,
-	0xc4068070, 0x008ce801,
+	0x80767074, 0x82778075,
+	0xee0a4076, 0x008c0000,
 	0x00000000, 0x807d037d,
 	0x80700370, 0xd5250000,
-	0x0001ff00, 0x00000080,
-	0xbf0a7b7d, 0xbfa2fff3,
-	0xbfa00012, 0xbe8300ff,
-	0x00000100, 0xbf800000,
-	0xbf800000, 0xbf800000,
-	0xd8d80000, 0x01000000,
-	0xbf8a0000, 0xc4068070,
-	0x008ce801, 0x00000000,
-	0x807d037d, 0x80700370,
-	0xd5250000, 0x0001ff00,
-	0x00000100, 0xbf0a7b7d,
-	0xbfa2fff3, 0xbefe00c1,
-	0x857d9973, 0x8b7d817d,
-	0xbf06817d, 0xbfa20004,
-	0xbef000ff, 0x00000200,
-	0xbeff0080, 0xbfa00003,
-	0xbef000ff, 0x00000400,
-	0xbeff00c1, 0xb8fb3b05,
-	0x807b817b, 0x847b827b,
-	0x857d9973, 0x8b7d817d,
-	0xbf06817d, 0xbfa2001b,
-	0xbef600ff, 0x01000000,
-	0xbefd0084, 0xbf0a7b7d,
-	0xbfa10040, 0x7e008700,
-	0x7e028701, 0x7e048702,
-	0x7e068703, 0xc4068070,
-	0x008ce800, 0x00000000,
-	0xc4068070, 0x008ce801,
-	0x00008000, 0xc4068070,
-	0x008ce802, 0x00010000,
-	0xc4068070, 0x008ce803,
-	0x00018000, 0x807d847d,
-	0x8070ff70, 0x00000200,
-	0xbf0a7b7d, 0xbfa2ffeb,
-	0xbfa0002a, 0xbef600ff,
-	0x01000000, 0xbefd0084,
-	0xbf0a7b7d, 0xbfa10015,
+	0x0001ff00, 0x00000100,
+	0xbf0a7b7d, 0xbfa2fff1,
+	0xbefe00c1, 0x857d9973,
+	0x8b7d817d, 0xbf06817d,
+	0xbfa20004, 0xbef000ff,
+	0x00000200, 0xbeff0080,
+	0xbfa00003, 0xbef000ff,
+	0x00000400, 0xbeff00c1,
+	0xb8fb3b05, 0x807b817b,
+	0x847b827b, 0x857d9973,
+	0x8b7d817d, 0xbf06817d,
+	0xbfa2001b, 0xbefd0084,
+	0xbf0a7b7d, 0xbfa10032,
 	0x7e008700, 0x7e028701,
 	0x7e048702, 0x7e068703,
-	0xc4068070, 0x008ce800,
-	0x00000000, 0xc4068070,
-	0x008ce801, 0x00010000,
-	0xc4068070, 0x008ce802,
-	0x00020000, 0xc4068070,
-	0x008ce803, 0x00030000,
+	0x80767074, 0x82778075,
+	0xee0a4076, 0x000c0000,
+	0x00000000, 0xee0a4076,
+	0x008c0000, 0x00008000,
+	0xee0a4076, 0x010c0000,
+	0x00010000, 0xee0a4076,
+	0x018c0000, 0x00018000,
 	0x807d847d, 0x8070ff70,
-	0x00000400, 0xbf0a7b7d,
-	0xbfa2ffeb, 0xb8fb1e06,
-	0x8b7bc17b, 0xbfa1000d,
-	0x847b837b, 0x807b7d7b,
-	0xbefe00c1, 0xbeff0080,
-	0x7e008700, 0xc4068070,
-	0x008ce800, 0x00000000,
-	0x807d817d, 0x8070ff70,
-	0x00000080, 0xbf0a7b7d,
-	0xbfa2fff7, 0xbfa00171,
-	0xbef4007e, 0x8b75ff7f,
-	0x0000ffff, 0x8c75ff75,
-	0x00040000, 0xbef60080,
-	0xbef700ff, 0x10807fac,
+	0x00000200, 0xbf0a7b7d,
+	0xbfa2ffe9, 0xbfa0001a,
+	0xbefd0084, 0xbf0a7b7d,
+	0xbfa10017, 0x7e008700,
+	0x7e028701, 0x7e048702,
+	0x7e068703, 0x80767074,
+	0x82778075, 0xee0a4076,
+	0x000c0000, 0x00000000,
+	0xee0a4076, 0x008c0000,
+	0x00010000, 0xee0a4076,
+	0x010c0000, 0x00020000,
+	0xee0a4076, 0x018c0000,
+	0x00030000, 0x807d847d,
+	0x8070ff70, 0x00000400,
+	0xbf0a7b7d, 0xbfa2ffe9,
+	0xbfa0014c, 0xbef4007e,
+	0x8b75ff7f, 0x0000ffff,
 	0xbef1007f, 0xb8f20742,
 	0x84729972, 0x8b6eff7f,
-	0x04000000, 0xbfa1003b,
-	0xbefe00c1, 0x857d9972,
-	0x8b7d817d, 0xbf06817d,
-	0xbfa20002, 0xbeff0080,
-	0xbfa00001, 0xbeff00c1,
-	0xb8ef4306, 0x8b6fc16f,
-	0xbfa10030, 0x846f896f,
-	0xbef6006f, 0xb8f83b05,
-	0x80788178, 0xbf0d9972,
-	0xbfa20002, 0x84788978,
-	0xbfa00001, 0x84788a78,
-	0xb8ee1e06, 0x846e8a6e,
-	0x80786e78, 0x8078ff78,
-	0x00000200, 0x8078ff78,
-	0x00000080, 0xbef600ff,
-	0x01000000, 0x857d9972,
-	0x8b7d817d, 0xbf06817d,
-	0xbefd0080, 0xbfa2000d,
-	0xc4050078, 0x0080e800,
-	0x00000000, 0xbf8a0000,
-	0xdac00000, 0x00000000,
-	0x807dff7d, 0x00000080,
-	0x8078ff78, 0x00000080,
-	0xbf0a6f7d, 0xbfa2fff4,
-	0xbfa0000c, 0xc4050078,
-	0x0080e800, 0x00000000,
-	0xbf8a0000, 0xdac00000,
-	0x00000000, 0x807dff7d,
-	0x00000100, 0x8078ff78,
-	0x00000100, 0xbf0a6f7d,
-	0xbfa2fff4, 0xbef80080,
+	0x04000000, 0xbfa10044,
 	0xbefe00c1, 0x857d9972,
 	0x8b7d817d, 0xbf06817d,
-	0xbfa20002, 0xbeff0080,
-	0xbfa00001, 0xbeff00c1,
-	0xb8ef3b05, 0x806f816f,
-	0x846f826f, 0x857d9972,
-	0x8b7d817d, 0xbf06817d,
-	0xbfa2002c, 0xbef600ff,
-	0x01000000, 0xbeee0078,
-	0x8078ff78, 0x00000200,
-	0xbefd0084, 0xbf0a6f7d,
-	0xbfa10061, 0xc4050078,
-	0x008ce800, 0x00000000,
-	0xc4050078, 0x008ce801,
-	0x00008000, 0xc4050078,
-	0x008ce802, 0x00010000,
-	0xc4050078, 0x008ce803,
-	0x00018000, 0xbf8a0000,
-	0x7e008500, 0x7e028501,
-	0x7e048502, 0x7e068503,
-	0x807d847d, 0x8078ff78,
-	0x00000200, 0xbf0a6f7d,
-	0xbfa2ffea, 0xc405006e,
-	0x008ce800, 0x00000000,
-	0xc405006e, 0x008ce801,
-	0x00008000, 0xc405006e,
-	0x008ce802, 0x00010000,
-	0xc405006e, 0x008ce803,
-	0x00018000, 0xbf8a0000,
-	0xbfa0003d, 0xbef600ff,
-	0x01000000, 0xbeee0078,
-	0x8078ff78, 0x00000400,
-	0xbefd0084, 0xbf0a6f7d,
-	0xbfa10016, 0xc4050078,
-	0x008ce800, 0x00000000,
-	0xc4050078, 0x008ce801,
-	0x00010000, 0xc4050078,
-	0x008ce802, 0x00020000,
-	0xc4050078, 0x008ce803,
-	0x00030000, 0xbf8a0000,
-	0x7e008500, 0x7e028501,
-	0x7e048502, 0x7e068503,
-	0x807d847d, 0x8078ff78,
-	0x00000400, 0xbf0a6f7d,
-	0xbfa2ffea, 0xb8ef1e06,
-	0x8b6fc16f, 0xbfa1000f,
-	0x846f836f, 0x806f7d6f,
-	0xbefe00c1, 0xbeff0080,
-	0xc4050078, 0x008ce800,
-	0x00000000, 0xbf8a0000,
-	0x7e008500, 0x807d817d,
-	0x8078ff78, 0x00000080,
-	0xbf0a6f7d, 0xbfa2fff6,
-	0xbeff00c1, 0xc405006e,
-	0x008ce800, 0x00000000,
-	0xc405006e, 0x008ce801,
-	0x00010000, 0xc405006e,
-	0x008ce802, 0x00020000,
-	0xc405006e, 0x008ce803,
-	0x00030000, 0xbf8a0000,
+	0xbfa20002, 0xbeff0080,
+	0xbfa00001, 0xbeff00c1,
+	0xb8ef4306, 0x8b6fc16f,
+	0xbfa10039, 0x846f896f,
 	0xb8f83b05, 0x80788178,
 	0xbf0d9972, 0xbfa20002,
 	0x84788978, 0xbfa00001,
-	0x84788a78, 0xb8ee1e06,
-	0x846e8a6e, 0x80786e78,
+	0x84788a78, 0x8078ff78,
+	0x00000200, 0x8078ff78,
+	0x00000080, 0x857d9972,
+	0x8b7d817d, 0xbf06817d,
+	0xbefd0080, 0xd71f0001,
+	0x000100c1, 0xd7200001,
+	0x000202c1, 0x30020282,
+	0xbfa20012, 0x80767874,
+	0x82778075, 0xee0a0076,
+	0x000c0000, 0x00000000,
+	0xbf8a0000, 0xd8340000,
+	0x00000001, 0xd5250001,
+	0x0001ff01, 0x00000080,
+	0x807dff7d, 0x00000080,
+	0x8078ff78, 0x00000080,
+	0xbf0a6f7d, 0xbfa2ffef,
+	0xbfa00011, 0x80767874,
+	0x82778075, 0xee0a0076,
+	0x000c0000, 0x00000000,
+	0xbf8a0000, 0xd8340000,
+	0x00000001, 0xd5250001,
+	0x0001ff01, 0x00000100,
+	0x807dff7d, 0x00000100,
+	0x8078ff78, 0x00000100,
+	0xbf0a6f7d, 0xbfa2ffef,
+	0xbef80080, 0xbefe00c1,
+	0x857d9972, 0x8b7d817d,
+	0xbf06817d, 0xbfa20002,
+	0xbeff0080, 0xbfa00001,
+	0xbeff00c1, 0xb8ef3b05,
+	0x806f816f, 0x846f826f,
+	0x857d9972, 0x8b7d817d,
+	0xbf06817d, 0xbfa2002c,
+	0xbeee0078, 0x8078ff78,
+	0x00000200, 0xbefd0084,
+	0x80767874, 0x82778075,
+	0xee0a0076, 0x000c0000,
+	0x00000000, 0xee0a0076,
+	0x000c0001, 0x00008000,
+	0xee0a0076, 0x000c0002,
+	0x00010000, 0xee0a0076,
+	0x000c0003, 0x00018000,
+	0xbf8a0000, 0x7e008500,
+	0x7e028501, 0x7e048502,
+	0x7e068503, 0x807d847d,
 	0x8078ff78, 0x00000200,
-	0x80f8ff78, 0x00000050,
-	0xbef600ff, 0x01000000,
+	0xbf0a6f7d, 0xbfa2ffe8,
+	0x80766e74, 0x82778075,
+	0xee0a0076, 0x000c0000,
+	0x00000000, 0xee0a0076,
+	0x000c0001, 0x00008000,
+	0xee0a0076, 0x000c0002,
+	0x00010000, 0xee0a0076,
+	0x000c0003, 0x00018000,
+	0xbf8a0000, 0xbfa0002d,
+	0xbeee0078, 0x8078ff78,
+	0x00000400, 0xbefd0084,
+	0xbf0a6f7d, 0xbfa10018,
+	0x80767874, 0x82778075,
+	0xee0a0076, 0x000c0000,
+	0x00000000, 0xee0a0076,
+	0x000c0001, 0x00010000,
+	0xee0a0076, 0x000c0002,
+	0x00020000, 0xee0a0076,
+	0x000c0003, 0x00030000,
+	0xbf8a0000, 0x7e008500,
+	0x7e028501, 0x7e048502,
+	0x7e068503, 0x807d847d,
+	0x8078ff78, 0x00000400,
+	0xbf0a6f7d, 0xbfa2ffe8,
+	0x80766e74, 0x82778075,
+	0xee0a0076, 0x000c0000,
+	0x00000000, 0xee0a0076,
+	0x000c0001, 0x00010000,
+	0xee0a0076, 0x000c0002,
+	0x00020000, 0xee0a0076,
+	0x000c0003, 0x00030000,
+	0xbf8a0000, 0xb8f83b05,
+	0x80788178, 0xbf0d9972,
+	0xbfa20002, 0x84788978,
+	0xbfa00001, 0x84788a78,
+	0x8078ff78, 0x00000200,
+	0x80f8ff78, 0x00000060,
+	0x80767874, 0x82778075,
 	0xbefd00ff, 0x0000006c,
-	0x80f89078, 0xf462403a,
-	0xf0000000, 0xbf8a0000,
-	0x80fd847d, 0xbf800000,
-	0xbe804300, 0xbe824302,
-	0x80f8a078, 0xf462603a,
-	0xf0000000, 0xbf8a0000,
+	0xf460403b, 0xf8000000,
+	0xbf8a0000, 0x80fd847d,
+	0xbf800000, 0xbe804300,
+	0xbe824302, 0x80f6a076,
+	0x82f78077, 0xf460603b,
+	0xf8000000, 0xbf8a0000,
 	0x80fd887d, 0xbf800000,
 	0xbe804300, 0xbe824302,
 	0xbe844304, 0xbe864306,
-	0x80f8c078, 0xf462803a,
-	0xf0000000, 0xbf8a0000,
-	0x80fd907d, 0xbf800000,
-	0xbe804300, 0xbe824302,
-	0xbe844304, 0xbe864306,
-	0xbe884308, 0xbe8a430a,
-	0xbe8c430c, 0xbe8e430e,
-	0xbf06807d, 0xbfa1fff0,
-	0xb980f801, 0x00000000,
-	0xb8f83b05, 0x80788178,
-	0xbf0d9972, 0xbfa20002,
-	0x84788978, 0xbfa00001,
-	0x84788a78, 0xb8ee1e06,
-	0x846e8a6e, 0x80786e78,
+	0x80f6c076, 0x82f78077,
+	0xf460803b, 0xf8000000,
+	0xbf8a0000, 0x80fd907d,
+	0xbf800000, 0xbe804300,
+	0xbe824302, 0xbe844304,
+	0xbe864306, 0xbe884308,
+	0xbe8a430a, 0xbe8c430c,
+	0xbe8e430e, 0xbf06807d,
+	0xbfa1ffef, 0xb980f801,
+	0x00000000, 0xb8f83b05,
+	0x80788178, 0xbf0d9972,
+	0xbfa20002, 0x84788978,
+	0xbfa00001, 0x84788a78,
 	0x8078ff78, 0x00000200,
-	0xbef600ff, 0x01000000,
-	0xbeff0071, 0xf4621bfa,
-	0xf0000000, 0x80788478,
-	0xf4621b3a, 0xf0000000,
-	0x80788478, 0xf4621b7a,
-	0xf0000000, 0x80788478,
-	0xf4621c3a, 0xf0000000,
-	0x80788478, 0xf4621c7a,
-	0xf0000000, 0x80788478,
-	0xf4621eba, 0xf0000000,
-	0x80788478, 0xf4621efa,
-	0xf0000000, 0x80788478,
-	0xf4621e7a, 0xf0000000,
-	0x80788478, 0xf4621cfa,
-	0xf0000000, 0x80788478,
-	0xf4621bba, 0xf0000000,
-	0x80788478, 0xbf8a0000,
-	0xb96ef814, 0xf4621bba,
-	0xf0000000, 0x80788478,
-	0xbf8a0000, 0xb96ef815,
-	0xf4621bba, 0xf0000000,
-	0x80788478, 0xbf8a0000,
-	0xb96ef812, 0xf4621bba,
-	0xf0000000, 0x80788478,
-	0xbf8a0000, 0xb96ef813,
-	0x8b6eff7f, 0x04000000,
-	0xbfa1000d, 0x80788478,
-	0xf4621bba, 0xf0000000,
-	0x80788478, 0xbf8a0000,
-	0xbf0d806e, 0xbfa10006,
-	0x856e906e, 0x8b6e6e6e,
-	0xbfa10003, 0xbe804ec1,
-	0x816ec16e, 0xbfa0fffb,
-	0xbefd006f, 0xbefe0070,
-	0xbeff0071, 0xb97b2011,
-	0x857b867b, 0xb97b0191,
-	0x857b827b, 0xb97bba11,
-	0xb973f801, 0xb8ee3b05,
-	0x806e816e, 0xbf0d9972,
-	0xbfa20002, 0x846e896e,
-	0xbfa00001, 0x846e8a6e,
-	0xb8ef1e06, 0x846f8a6f,
-	0x806e6f6e, 0x806eff6e,
-	0x00000200, 0x806e746e,
-	0x826f8075, 0x8b6fff6f,
-	0x0000ffff, 0xf4605c37,
-	0xf8000050, 0xf4605d37,
-	0xf8000060, 0xf4601e77,
-	0xf8000074, 0xbf8a0000,
+	0x80767874, 0x82778075,
+	0xbeff0071, 0xf4601bfb,
+	0xf8000000, 0xf4601b3b,
+	0xf8000004, 0xf4601b7b,
+	0xf8000008, 0xf4601c3b,
+	0xf800000c, 0xf4601c7b,
+	0xf8000010, 0xf4601ebb,
+	0xf8000014, 0xf4601efb,
+	0xf8000018, 0xf4601e7b,
+	0xf800001c, 0xf4601cfb,
+	0xf8000020, 0xf4601bbb,
+	0xf8000024, 0xbf8a0000,
+	0xb96ef814, 0xf4601bbb,
+	0xf8000028, 0xbf8a0000,
+	0xb96ef815, 0xf4601bbb,
+	0xf800002c, 0xbf8a0000,
+	0xb96ef812, 0xf4601bbb,
+	0xf8000030, 0xbf8a0000,
+	0xb96ef813, 0x8b6eff7f,
+	0x04000000, 0xbfa1000b,
+	0xf4601bbb, 0xf8000038,
+	0xbf8a0000, 0xbf0d806e,
+	0xbfa10006, 0x856e906e,
+	0x8b6e6e6e, 0xbfa10003,
+	0xbe804ec1, 0x816ec16e,
+	0xbfa0fffb, 0xbefd006f,
+	0xbefe0070, 0xbeff0071,
+	0xb97b2011, 0x857b867b,
+	0xb97b0191, 0x857b827b,
+	0xb97bba11, 0xb973f801,
+	0xb8ee3b05, 0x806e816e,
+	0xbf0d9972, 0xbfa20002,
+	0x846e896e, 0xbfa00001,
+	0x846e8a6e, 0x806eff6e,
+	0x00000240, 0x806e746e,
+	0x826f8075, 0xf4605c37,
+	0xf8000010, 0xf4605d37,
+	0xf8000020, 0xf4601e77,
+	0xf8000034, 0xbf8a0000,
 	0x8b6dff6d, 0x0000ffff,
 	0x8bfe7e7e, 0x8bea6a6a,
 	0x936eff77, 0x0002001a,
@@ -4666,14 +4587,18 @@ static const uint32_t cwsr_trap_gfx9_5_0_hex[] = {
 };
 
 static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
-	0xbfa00001, 0xbfa003b7,
-	0xb0804009, 0xb8f8f804,
+	0xbfa00001, 0xbfa003aa,
+	0xb0804009, 0xb8eef81a,
+	0xbf880000, 0xb980081a,
+	0x00000000, 0xb8f8f804,
+	0x9177ff77, 0x0c000000,
+	0x846e9a6e, 0x8c776e77,
 	0x9178ff78, 0x00008c00,
 	0xb8fbf811, 0x8b6eff78,
 	0x00004000, 0xbfa10008,
 	0x8b6eff7b, 0x00000080,
 	0xbfa20018, 0x8b6ea07b,
-	0xbfa200e1, 0xbf830010,
+	0xbfa200d4, 0xbf830010,
 	0xb8fbf811, 0xbfa0fffb,
 	0x8b6eff7b, 0x00000bd0,
 	0xbfa20010, 0xb8eef812,
@@ -4684,7 +4609,7 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0xf0000000, 0xbfa20005,
 	0x8b6fff6f, 0x00000200,
 	0xbfa20002, 0x8b6ea07b,
-	0xbfa200cb, 0x9177ff77,
+	0xbfa200be, 0x9177ff77,
 	0x007fc000, 0xb8fa04a1,
 	0x847a967a, 0x8c777a77,
 	0xb8fa0421, 0x847a957a,
@@ -4777,263 +4702,230 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0xb97a0421, 0x857a8e77,
 	0xb97a3021, 0x8bfe7e7e,
 	0x8bea6a6a, 0x85788978,
-	0xb9783244, 0xbe804a6c,
-	0xb8faf802, 0xbf0d987a,
-	0xbfa10001, 0xbfb00000,
-	0x8b6dff6d, 0x01ffffff,
-	0xbefa0080, 0xb97a0151,
-	0x9177ff77, 0x007fc000,
-	0xb8fa04a1, 0x847a967a,
-	0x8c777a77, 0xb8fa0421,
-	0x847a957a, 0x8c777a77,
-	0xb8fa3021, 0x847a8e7a,
-	0x8c777a77, 0xb980f821,
-	0x00000000, 0xbf0d847b,
-	0xbfa20078, 0xf4003eb6,
-	0xf8000000, 0xf4003bb6,
-	0xf8000008, 0xbfc70001,
-	0x8b76ff7a, 0x80000000,
-	0xbfa20027, 0x9376ff7a,
-	0x00060019, 0x81f9a376,
-	0xbf0b8179, 0xbfa20068,
-	0x81f9ac76, 0xbf0b8179,
-	0xbfa20062, 0x81f9b776,
-	0xbf0b8179, 0xbfa2005f,
-	0x8b76ff7a, 0x000001ff,
-	0xbf06ff76, 0x000000fe,
-	0xbfa2005d, 0xbf06ff76,
-	0x000000ff, 0xbfa20057,
-	0xbf06ff76, 0x000000fa,
-	0xbfa20054, 0x81f9ff76,
-	0x000000e9, 0xbf0b8179,
-	0xbfa20050, 0x8b76ff7b,
-	0xffff0000, 0xbf06ff76,
-	0xbf860000, 0xbfa10051,
-	0x9376ff7b, 0x0002000e,
-	0x8b79ff7b, 0x00003f00,
-	0x85798679, 0x8c767976,
-	0xb9763b01, 0xbfa00049,
-	0x8b76ff7a, 0xfc000000,
-	0xbf06ff76, 0xd4000000,
-	0xbfa20013, 0xbf06ff76,
-	0xc8000000, 0xbfa20027,
-	0x8b76ff7a, 0xff000000,
-	0xbf06ff76, 0xcf000000,
-	0xbfa20039, 0x8b79ff7a,
-	0xffff0000, 0xbf06ff79,
-	0xcc350000, 0xbfa20037,
-	0xbf06ff79, 0xcc3a0000,
-	0xbfa20034, 0xbf06ff76,
-	0xcc000000, 0xbfa10031,
-	0x8b76ff7b, 0x000001ff,
+	0x936eff77, 0x0002001a,
+	0xb96ef81a, 0xb9783244,
+	0xbe804a6c, 0xb8faf802,
+	0xbf0d987a, 0xbfa10001,
+	0xbfb00000, 0x8b6dff6d,
+	0x01ffffff, 0xbefa0080,
+	0xb97a0151, 0x9177ff77,
+	0x007fc000, 0xb8fa04a1,
+	0x847a967a, 0x8c777a77,
+	0xb8fa0421, 0x847a957a,
+	0x8c777a77, 0xb8fa3021,
+	0x847a8e7a, 0x8c777a77,
+	0xb980f821, 0x00000000,
+	0xbf0d847b, 0xbfa20078,
+	0xf4003eb6, 0xf8000000,
+	0xf4003bb6, 0xf8000008,
+	0xbfc70001, 0x8b76ff7a,
+	0x80000000, 0xbfa20027,
+	0x9376ff7a, 0x00060019,
+	0x81f9a376, 0xbf0b8179,
+	0xbfa20068, 0x81f9ac76,
+	0xbf0b8179, 0xbfa20062,
+	0x81f9b776, 0xbf0b8179,
+	0xbfa2005f, 0x8b76ff7a,
+	0x000001ff, 0xbf06ff76,
+	0x000000fe, 0xbfa2005d,
 	0xbf06ff76, 0x000000ff,
-	0xbfa20029, 0xbf06ff76,
-	0x000000fa, 0xbfa20026,
-	0x81f6ff76, 0x000000e9,
-	0xbf0b8176, 0xbfa20022,
-	0x8b76ff7b, 0x0003fe00,
-	0xbf06ff76, 0x0001fe00,
-	0xbfa2001d, 0x8b76ff7b,
-	0x07fc0000, 0xbf06ff76,
-	0x03fc0000, 0xbfa20018,
-	0xbfa00014, 0x9376ff7a,
-	0x00040016, 0x81f68176,
-	0xbf0b8176, 0xbfa20012,
-	0x9376ff7a, 0x00050011,
-	0x81f68176, 0xbf0b8176,
-	0xbfa2000d, 0x8b76ff7a,
+	0xbfa20057, 0xbf06ff76,
+	0x000000fa, 0xbfa20054,
+	0x81f9ff76, 0x000000e9,
+	0xbf0b8179, 0xbfa20050,
+	0x8b76ff7b, 0xffff0000,
+	0xbf06ff76, 0xbf860000,
+	0xbfa10051, 0x9376ff7b,
+	0x0002000e, 0x8b79ff7b,
+	0x00003f00, 0x85798679,
+	0x8c767976, 0xb9763b01,
+	0xbfa00049, 0x8b76ff7a,
+	0xfc000000, 0xbf06ff76,
+	0xd4000000, 0xbfa20013,
+	0xbf06ff76, 0xc8000000,
+	0xbfa20027, 0x8b76ff7a,
+	0xff000000, 0xbf06ff76,
+	0xcf000000, 0xbfa20039,
+	0x8b79ff7a, 0xffff0000,
+	0xbf06ff79, 0xcc350000,
+	0xbfa20037, 0xbf06ff79,
+	0xcc3a0000, 0xbfa20034,
+	0xbf06ff76, 0xcc000000,
+	0xbfa10031, 0x8b76ff7b,
 	0x000001ff, 0xbf06ff76,
-	0x000000ff, 0xbfa20008,
-	0x8b76ff7b, 0x000001ff,
+	0x000000ff, 0xbfa20029,
+	0xbf06ff76, 0x000000fa,
+	0xbfa20026, 0x81f6ff76,
+	0x000000e9, 0xbf0b8176,
+	0xbfa20022, 0x8b76ff7b,
+	0x0003fe00, 0xbf06ff76,
+	0x0001fe00, 0xbfa2001d,
+	0x8b76ff7b, 0x07fc0000,
+	0xbf06ff76, 0x03fc0000,
+	0xbfa20018, 0xbfa00014,
+	0x9376ff7a, 0x00040016,
+	0x81f68176, 0xbf0b8176,
+	0xbfa20012, 0x9376ff7a,
+	0x00050011, 0x81f68176,
+	0xbf0b8176, 0xbfa2000d,
+	0x8b76ff7a, 0x000001ff,
 	0xbf06ff76, 0x000000ff,
-	0xbfa20003, 0xbfc70000,
-	0xbefb006e, 0xbfa0ffad,
-	0xbfc70000, 0xbefb006f,
-	0xbfa0ffaa, 0xbfc70000,
-	0xbeee007e, 0xbeef007f,
-	0xbefe0180, 0xbefe4d84,
-	0xbf8a0000, 0x8b7aff7f,
-	0x04000000, 0x847a857a,
-	0x8c6d7a6d, 0xb8eff822,
-	0xb980f822, 0x00000000,
-	0xb8fa2b01, 0x847a997a,
-	0x8c6d7a6d, 0xbefa0080,
-	0xb97a2b01, 0xbefa007e,
+	0xbfa20008, 0x8b76ff7b,
+	0x000001ff, 0xbf06ff76,
+	0x000000ff, 0xbfa20003,
+	0xbfc70000, 0xbefb006e,
+	0xbfa0ffad, 0xbfc70000,
+	0xbefb006f, 0xbfa0ffaa,
+	0xbfc70000, 0xbeee007e,
+	0xbeef007f, 0xbefe0180,
+	0xbefe4d84, 0xbf8a0000,
+	0x8b7aff7f, 0x04000000,
+	0x847a857a, 0x8c6d7a6d,
+	0xb8eff822, 0xb980f822,
+	0x00000000, 0xb8fa2b01,
+	0x847a997a, 0x8c6d7a6d,
+	0xbefa0080, 0xb97a2b01,
+	0xbefa007e, 0x8b7bff7f,
+	0x01ffffff, 0xbefe00c1,
+	0xbeff00c1, 0xee0a407a,
+	0x000c0000, 0x00000000,
+	0x7e000280, 0xbefe007a,
+	0xbeff007b, 0xb8fb0742,
+	0x847b997b, 0xb8fa3b05,
+	0x807a817a, 0xbf0d997b,
+	0xbfa20002, 0x847a897a,
+	0xbfa00001, 0x847a8a7a,
 	0x8b7bff7f, 0x01ffffff,
-	0xbefe00c1, 0xbeff00c1,
-	0xee0a407a, 0x000c0000,
-	0x00000000, 0x7e000280,
+	0x807aff7a, 0x000001c0,
+	0x807a7e7a, 0x827b807b,
+	0xd7610000, 0x00010870,
+	0xd7610000, 0x00010a71,
+	0xd7610000, 0x00010c72,
+	0xd7610000, 0x00010e73,
+	0xd7610000, 0x00011074,
+	0xd7610000, 0x00011275,
+	0xd7610000, 0x00011476,
+	0xd7610000, 0x00011677,
+	0xd7610000, 0x00011a79,
+	0xd7610000, 0x00011c7e,
+	0xd7610000, 0x00011e7f,
+	0xbefe00ff, 0x00003fff,
+	0xbeff0080, 0xee0a407a,
+	0x000c0000, 0x00000000,
+	0xd760007a, 0x00011d00,
+	0xd760007b, 0x00011f00,
 	0xbefe007a, 0xbeff007b,
-	0xb8fb0742, 0x847b997b,
-	0xb8fa3b05, 0x807a817a,
-	0xbf0d997b, 0xbfa20002,
-	0x847a897a, 0xbfa00001,
-	0x847a8a7a, 0x8b7bff7f,
-	0x01ffffff, 0x807aff7a,
-	0x000001c0, 0x807a7e7a,
-	0x827b807b, 0xd7610000,
-	0x00010870, 0xd7610000,
-	0x00010a71, 0xd7610000,
-	0x00010c72, 0xd7610000,
-	0x00010e73, 0xd7610000,
-	0x00011074, 0xd7610000,
-	0x00011275, 0xd7610000,
-	0x00011476, 0xd7610000,
-	0x00011677, 0xd7610000,
-	0x00011a79, 0xd7610000,
-	0x00011c7e, 0xd7610000,
-	0x00011e7f, 0xbefe00ff,
-	0x00003fff, 0xbeff0080,
-	0xee0a407a, 0x000c0000,
-	0x00000000, 0xd760007a,
-	0x00011d00, 0xd760007b,
-	0x00011f00, 0xbefe007a,
-	0xbeff007b, 0xbef4007e,
-	0x8b75ff7f, 0x01ffffff,
-	0xbef1007d, 0xb8f30742,
-	0x84739973, 0xbefe00c1,
-	0x857d9973, 0x8b7d817d,
-	0xbf06817d, 0xbfa20002,
-	0xbeff0080, 0xbfa00002,
-	0xbeff00c1, 0xbfa0000a,
-	0xee0a4074, 0x008c0000,
-	0x00008000, 0xee0a4074,
-	0x010c0000, 0x00010000,
-	0xee0a4074, 0x018c0000,
-	0x00018000, 0xbfa00009,
-	0xee0a4074, 0x008c0000,
+	0xbef4007e, 0x8b75ff7f,
+	0x01ffffff, 0xbef1007d,
+	0xb8f30742, 0x84739973,
+	0xbefe00c1, 0x857d9973,
+	0x8b7d817d, 0xbf06817d,
+	0xbfa20002, 0xbeff0080,
+	0xbfa00002, 0xbeff00c1,
+	0xbfa0000a, 0xee0a4074,
+	0x008c0000, 0x00008000,
+	0xee0a4074, 0x010c0000,
 	0x00010000, 0xee0a4074,
-	0x010c0000, 0x00020000,
-	0xee0a4074, 0x018c0000,
-	0x00030000, 0xb8f03b05,
-	0x80708170, 0xbf0d9973,
-	0xbfa20002, 0x84708970,
-	0xbfa00001, 0x84708a70,
-	0x8070ff70, 0x00000200,
-	0x7e000280, 0x7e020280,
-	0x7e040280, 0xbefd0080,
-	0xb8faf802, 0xbf0c8b7a,
-	0xbfa20003, 0xbe804fc2,
-	0xbf94fffe, 0xbfa10001,
-	0xbe804ec4, 0xbf94fffc,
-	0xb8faf804, 0x8b7aff7a,
-	0x0001000c, 0x9178ff78,
-	0x0001000c, 0x8c787a78,
-	0xd7610002, 0x0000fa71,
-	0x807d817d, 0xd7610002,
-	0x0000fa6c, 0x807d817d,
-	0x917aff6d, 0x80000000,
-	0xd7610002, 0x0000fa7a,
-	0x807d817d, 0xd7610002,
-	0x0000fa6e, 0x807d817d,
-	0xbefa0080, 0xd7610002,
+	0x018c0000, 0x00018000,
+	0xbfa00009, 0xee0a4074,
+	0x008c0000, 0x00010000,
+	0xee0a4074, 0x010c0000,
+	0x00020000, 0xee0a4074,
+	0x018c0000, 0x00030000,
+	0xb8f03b05, 0x80708170,
+	0xbf0d9973, 0xbfa20002,
+	0x84708970, 0xbfa00001,
+	0x84708a70, 0x8070ff70,
+	0x00000200, 0x7e000280,
+	0x7e020280, 0x7e040280,
+	0xbefd0080, 0xb8faf802,
+	0xbf0c8b7a, 0xbfa20003,
+	0xbe804fc2, 0xbf94fffe,
+	0xbfa10001, 0xbe804ec4,
+	0xbf94fffc, 0xb8faf804,
+	0x8b7aff7a, 0x0001000c,
+	0x9178ff78, 0x0001000c,
+	0x8c787a78, 0xd7610002,
+	0x0000fa71, 0x807d817d,
+	0xd7610002, 0x0000fa6c,
+	0x807d817d, 0x917aff6d,
+	0x80000000, 0xd7610002,
 	0x0000fa7a, 0x807d817d,
-	0xd7610002, 0x0000fa78,
-	0x807d817d, 0xb8faf811,
+	0xd7610002, 0x0000fa6e,
+	0x807d817d, 0xbefa0080,
 	0xd7610002, 0x0000fa7a,
 	0x807d817d, 0xd7610002,
-	0x0000fa6f, 0x807d817d,
-	0xb8f1f801, 0x937aff6d,
-	0x00060019, 0x847a8c7a,
-	0x8c717a71, 0xd7610002,
-	0x0000fa71, 0x807d817d,
-	0xb8f1f814, 0xd7610002,
-	0x0000fa71, 0x807d817d,
-	0xb8f1f815, 0xd7610002,
-	0x0000fa71, 0x807d817d,
-	0xb8f1f812, 0xd7610002,
-	0x0000fa71, 0x807d817d,
-	0xb8f1f813, 0xd7610002,
-	0x0000fa71, 0x807d817d,
-	0xb8faf802, 0xd7610002,
+	0x0000fa78, 0x807d817d,
+	0xb8faf811, 0xd7610002,
 	0x0000fa7a, 0x807d817d,
-	0xbefa50c1, 0xbfc70000,
+	0xd7610002, 0x0000fa6f,
+	0x807d817d, 0xb8f1f801,
+	0x937aff6d, 0x00060019,
+	0x847a8c7a, 0x8c717a71,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8f1f814,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8f1f815,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8f1f812,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8f1f813,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8faf802,
 	0xd7610002, 0x0000fa7a,
-	0x807d817d, 0xbefa4c88,
+	0x807d817d, 0xbefa50c1,
 	0xbfc70000, 0xd7610002,
 	0x0000fa7a, 0x807d817d,
-	0xbefe00ff, 0x0000ffff,
-	0xbeff0080, 0x80767074,
-	0x82778075, 0xee0a4076,
-	0x010c0000, 0x00000000,
-	0xbefe00c1, 0x7e040280,
-	0xbefa5081, 0xbfc70000,
-	0xd7610002, 0x0001007a,
-	0xbefa5082, 0xbfc70000,
-	0xd7610002, 0x0001027a,
-	0xbefa5083, 0xbfc70000,
-	0xd7610002, 0x0001047a,
-	0xbefa5084, 0xbfc70000,
-	0xd7610002, 0x0001067a,
-	0xbefa5085, 0xbfc70000,
-	0xd7610002, 0x0001087a,
-	0xbefa5086, 0xbfc70000,
-	0xd7610002, 0x00010a7a,
-	0xbefa5087, 0xbfc70000,
-	0xd7610002, 0x00010c7a,
-	0xbefa5088, 0xbfc70000,
-	0xd7610002, 0x00010e7a,
-	0xbefa5089, 0xbfc70000,
-	0xd7610002, 0x0001107a,
-	0xbefa508a, 0xbfc70000,
-	0xd7610002, 0x0001127a,
-	0xbefa508b, 0xbfc70000,
-	0xd7610002, 0x0001147a,
-	0xbefa508c, 0xbfc70000,
-	0xd7610002, 0x0001167a,
-	0xbefa508d, 0xbfc70000,
-	0xd7610002, 0x0001187a,
-	0xbefa508e, 0xbfc70000,
-	0xd7610002, 0x00011a7a,
-	0xbefa508f, 0xbfc70000,
-	0xd7610002, 0x00011c7a,
-	0xbefa5090, 0xbfc70000,
-	0xd7610002, 0x00011e7a,
+	0xbefa4c88, 0xbfc70000,
+	0xd7610002, 0x0000fa7a,
+	0x807d817d, 0xbefe00ff,
+	0x0000ffff, 0xbeff0080,
+	0x80767074, 0x82778075,
 	0xee0a4076, 0x010c0000,
-	0x00008000, 0xb8f03b05,
-	0x80708170, 0xbf0d9973,
-	0xbfa20002, 0x84708970,
-	0xbfa00001, 0x84708a70,
-	0xbef90080, 0xbefd0080,
-	0xbf800000, 0xbe804100,
-	0xbe824102, 0xbe844104,
-	0xbe864106, 0xbe884108,
-	0xbe8a410a, 0xbe8c410c,
-	0xbe8e410e, 0xd7610002,
-	0x0000f200, 0x80798179,
-	0xd7610002, 0x0000f201,
-	0x80798179, 0xd7610002,
-	0x0000f202, 0x80798179,
-	0xd7610002, 0x0000f203,
-	0x80798179, 0xd7610002,
-	0x0000f204, 0x80798179,
-	0xd7610002, 0x0000f205,
-	0x80798179, 0xd7610002,
-	0x0000f206, 0x80798179,
-	0xd7610002, 0x0000f207,
-	0x80798179, 0xd7610002,
-	0x0000f208, 0x80798179,
-	0xd7610002, 0x0000f209,
-	0x80798179, 0xd7610002,
-	0x0000f20a, 0x80798179,
-	0xd7610002, 0x0000f20b,
-	0x80798179, 0xd7610002,
-	0x0000f20c, 0x80798179,
-	0xd7610002, 0x0000f20d,
-	0x80798179, 0xd7610002,
-	0x0000f20e, 0x80798179,
-	0xd7610002, 0x0000f20f,
-	0x80798179, 0xbf06a079,
-	0xbfa10009, 0x80767074,
-	0x82778075, 0xee0a4076,
-	0x010c0000, 0x00000000,
-	0x8070ff70, 0x00000080,
-	0xbef90080, 0x7e040280,
-	0x807d907d, 0xbf0aff7d,
-	0x00000060, 0xbfa2ffb9,
+	0x00000000, 0xbefe00c1,
+	0x7e040280, 0xbefa5081,
+	0xbfc70000, 0xd7610002,
+	0x0001007a, 0xbefa5082,
+	0xbfc70000, 0xd7610002,
+	0x0001027a, 0xbefa5083,
+	0xbfc70000, 0xd7610002,
+	0x0001047a, 0xbefa5084,
+	0xbfc70000, 0xd7610002,
+	0x0001067a, 0xbefa5085,
+	0xbfc70000, 0xd7610002,
+	0x0001087a, 0xbefa5086,
+	0xbfc70000, 0xd7610002,
+	0x00010a7a, 0xbefa5087,
+	0xbfc70000, 0xd7610002,
+	0x00010c7a, 0xbefa5088,
+	0xbfc70000, 0xd7610002,
+	0x00010e7a, 0xbefa5089,
+	0xbfc70000, 0xd7610002,
+	0x0001107a, 0xbefa508a,
+	0xbfc70000, 0xd7610002,
+	0x0001127a, 0xbefa508b,
+	0xbfc70000, 0xd7610002,
+	0x0001147a, 0xbefa508c,
+	0xbfc70000, 0xd7610002,
+	0x0001167a, 0xbefa508d,
+	0xbfc70000, 0xd7610002,
+	0x0001187a, 0xbefa508e,
+	0xbfc70000, 0xd7610002,
+	0x00011a7a, 0xbefa508f,
+	0xbfc70000, 0xd7610002,
+	0x00011c7a, 0xbefa5090,
+	0xbfc70000, 0xd7610002,
+	0x00011e7a, 0xee0a4076,
+	0x010c0000, 0x00008000,
+	0xb8f03b05, 0x80708170,
+	0xbf0d9973, 0xbfa20002,
+	0x84708970, 0xbfa00001,
+	0x84708a70, 0xbef90080,
+	0xbefd0080, 0xbf800000,
 	0xbe804100, 0xbe824102,
 	0xbe844104, 0xbe864106,
 	0xbe884108, 0xbe8a410a,
+	0xbe8c410c, 0xbe8e410e,
 	0xd7610002, 0x0000f200,
 	0x80798179, 0xd7610002,
 	0x0000f201, 0x80798179,
@@ -5052,271 +4944,307 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0xd7610002, 0x0000f20a,
 	0x80798179, 0xd7610002,
 	0x0000f20b, 0x80798179,
-	0xbefe00ff, 0x0000ffff,
+	0xd7610002, 0x0000f20c,
+	0x80798179, 0xd7610002,
+	0x0000f20d, 0x80798179,
+	0xd7610002, 0x0000f20e,
+	0x80798179, 0xd7610002,
+	0x0000f20f, 0x80798179,
+	0xbf06a079, 0xbfa10009,
 	0x80767074, 0x82778075,
 	0xee0a4076, 0x010c0000,
-	0x00000000, 0xbefe00c1,
-	0x857d9973, 0x8b7d817d,
-	0xbf06817d, 0xbfa20002,
-	0xbeff0080, 0xbfa00001,
-	0xbeff00c1, 0xb8fb4306,
-	0x8b7bc17b, 0xbfa10042,
-	0x8b7aff6d, 0x80000000,
-	0xbfa1003f, 0x847b8a7b,
-	0xb8f03b05, 0x80708170,
-	0xbf0d9973, 0xbfa20002,
-	0x84708970, 0xbfa00001,
-	0x84708a70, 0x8070ff70,
-	0x00000200, 0x8070ff70,
-	0x00000200, 0xd71f0000,
-	0x000100c1, 0xd7200000,
-	0x000200c1, 0x16000084,
-	0x857d9973, 0x8b7d817d,
-	0xbf06817d, 0xbefd0080,
-	0xbfa20015, 0xbe8300ff,
-	0x00000080, 0xbf800000,
-	0xbf800000, 0xbf800000,
-	0xd8d80000, 0x01000000,
-	0xbf8a0000, 0x80767074,
+	0x00000000, 0x8070ff70,
+	0x00000080, 0xbef90080,
+	0x7e040280, 0x807d907d,
+	0xbf0aff7d, 0x00000060,
+	0xbfa2ffb9, 0xbe804100,
+	0xbe824102, 0xbe844104,
+	0xbe864106, 0xbe884108,
+	0xbe8a410a, 0xd7610002,
+	0x0000f200, 0x80798179,
+	0xd7610002, 0x0000f201,
+	0x80798179, 0xd7610002,
+	0x0000f202, 0x80798179,
+	0xd7610002, 0x0000f203,
+	0x80798179, 0xd7610002,
+	0x0000f204, 0x80798179,
+	0xd7610002, 0x0000f205,
+	0x80798179, 0xd7610002,
+	0x0000f206, 0x80798179,
+	0xd7610002, 0x0000f207,
+	0x80798179, 0xd7610002,
+	0x0000f208, 0x80798179,
+	0xd7610002, 0x0000f209,
+	0x80798179, 0xd7610002,
+	0x0000f20a, 0x80798179,
+	0xd7610002, 0x0000f20b,
+	0x80798179, 0xbefe00ff,
+	0x0000ffff, 0x80767074,
 	0x82778075, 0xee0a4076,
-	0x008c0000, 0x00000000,
-	0x807d037d, 0x80700370,
-	0xd5250000, 0x0001ff00,
-	0x00000080, 0xbf0a7b7d,
-	0xbfa2fff1, 0xbfa00014,
-	0xbe8300ff, 0x00000100,
-	0xbf800000, 0xbf800000,
-	0xbf800000, 0xd8d80000,
-	0x01000000, 0xbf8a0000,
-	0x80767074, 0x82778075,
-	0xee0a4076, 0x008c0000,
-	0x00000000, 0x807d037d,
-	0x80700370, 0xd5250000,
-	0x0001ff00, 0x00000100,
-	0xbf0a7b7d, 0xbfa2fff1,
+	0x010c0000, 0x00000000,
 	0xbefe00c1, 0x857d9973,
 	0x8b7d817d, 0xbf06817d,
-	0xbfa20004, 0xbef000ff,
-	0x00000200, 0xbeff0080,
-	0xbfa00003, 0xbef000ff,
-	0x00000400, 0xbeff00c1,
-	0xb8fb3b05, 0x807b817b,
-	0x847b827b, 0x857d9973,
+	0xbfa20002, 0xbeff0080,
+	0xbfa00001, 0xbeff00c1,
+	0xb8fb4306, 0x8b7bc17b,
+	0xbfa10042, 0x8b7aff6d,
+	0x80000000, 0xbfa1003f,
+	0x847b8a7b, 0xb8f03b05,
+	0x80708170, 0xbf0d9973,
+	0xbfa20002, 0x84708970,
+	0xbfa00001, 0x84708a70,
+	0x8070ff70, 0x00000200,
+	0x8070ff70, 0x00000200,
+	0xd71f0000, 0x000100c1,
+	0xd7200000, 0x000200c1,
+	0x16000084, 0x857d9973,
 	0x8b7d817d, 0xbf06817d,
-	0xbfa2001b, 0xbefd0084,
-	0xbf0a7b7d, 0xbfa10032,
-	0x7e008700, 0x7e028701,
-	0x7e048702, 0x7e068703,
-	0x80767074, 0x82778075,
-	0xee0a4076, 0x000c0000,
-	0x00000000, 0xee0a4076,
-	0x008c0000, 0x00008000,
-	0xee0a4076, 0x010c0000,
-	0x00010000, 0xee0a4076,
-	0x018c0000, 0x00018000,
-	0x807d847d, 0x8070ff70,
-	0x00000200, 0xbf0a7b7d,
-	0xbfa2ffe9, 0xbfa0001a,
+	0xbefd0080, 0xbfa20015,
+	0xbe8300ff, 0x00000080,
+	0xbf800000, 0xbf800000,
+	0xbf800000, 0xd8d80000,
+	0x01000000, 0xbf8a0000,
+	0x80767074, 0x82778075,
+	0xee0a4076, 0x008c0000,
+	0x00000000, 0x807d037d,
+	0x80700370, 0xd5250000,
+	0x0001ff00, 0x00000080,
+	0xbf0a7b7d, 0xbfa2fff1,
+	0xbfa00014, 0xbe8300ff,
+	0x00000100, 0xbf800000,
+	0xbf800000, 0xbf800000,
+	0xd8d80000, 0x01000000,
+	0xbf8a0000, 0x80767074,
+	0x82778075, 0xee0a4076,
+	0x008c0000, 0x00000000,
+	0x807d037d, 0x80700370,
+	0xd5250000, 0x0001ff00,
+	0x00000100, 0xbf0a7b7d,
+	0xbfa2fff1, 0xbefe00c1,
+	0x857d9973, 0x8b7d817d,
+	0xbf06817d, 0xbfa20004,
+	0xbef000ff, 0x00000200,
+	0xbeff0080, 0xbfa00003,
+	0xbef000ff, 0x00000400,
+	0xbeff00c1, 0xb8fb3b05,
+	0x807b817b, 0x847b827b,
+	0x857d9973, 0x8b7d817d,
+	0xbf06817d, 0xbfa2001b,
 	0xbefd0084, 0xbf0a7b7d,
-	0xbfa10017, 0x7e008700,
+	0xbfa10032, 0x7e008700,
 	0x7e028701, 0x7e048702,
 	0x7e068703, 0x80767074,
 	0x82778075, 0xee0a4076,
 	0x000c0000, 0x00000000,
 	0xee0a4076, 0x008c0000,
-	0x00010000, 0xee0a4076,
-	0x010c0000, 0x00020000,
+	0x00008000, 0xee0a4076,
+	0x010c0000, 0x00010000,
 	0xee0a4076, 0x018c0000,
-	0x00030000, 0x807d847d,
-	0x8070ff70, 0x00000400,
+	0x00018000, 0x807d847d,
+	0x8070ff70, 0x00000200,
 	0xbf0a7b7d, 0xbfa2ffe9,
-	0xbfa00180, 0xbef4007e,
-	0x8b75ff7f, 0x01ffffff,
-	0xbef1007f, 0xb8f20742,
-	0x84729972, 0x8b6eff7f,
-	0x04000000, 0xbfa10044,
+	0xbfa0001a, 0xbefd0084,
+	0xbf0a7b7d, 0xbfa10017,
+	0x7e008700, 0x7e028701,
+	0x7e048702, 0x7e068703,
+	0x80767074, 0x82778075,
+	0xee0a4076, 0x000c0000,
+	0x00000000, 0xee0a4076,
+	0x008c0000, 0x00010000,
+	0xee0a4076, 0x010c0000,
+	0x00020000, 0xee0a4076,
+	0x018c0000, 0x00030000,
+	0x807d847d, 0x8070ff70,
+	0x00000400, 0xbf0a7b7d,
+	0xbfa2ffe9, 0xbfa00183,
+	0xbef4007e, 0x8b75ff7f,
+	0x01ffffff, 0xbef1007f,
+	0xb8f20742, 0x84729972,
+	0x8b6eff7f, 0x04000000,
+	0xbfa10044, 0xbefe00c1,
+	0x857d9972, 0x8b7d817d,
+	0xbf06817d, 0xbfa20002,
+	0xbeff0080, 0xbfa00001,
+	0xbeff00c1, 0xb8ef4306,
+	0x8b6fc16f, 0xbfa10039,
+	0x846f8a6f, 0xb8f83b05,
+	0x80788178, 0xbf0d9972,
+	0xbfa20002, 0x84788978,
+	0xbfa00001, 0x84788a78,
+	0x8078ff78, 0x00000200,
+	0x8078ff78, 0x00000200,
+	0x857d9972, 0x8b7d817d,
+	0xbf06817d, 0xbefd0080,
+	0xd71f0001, 0x000100c1,
+	0xd7200001, 0x000202c1,
+	0x30020282, 0xbfa20012,
+	0x80767874, 0x82778075,
+	0xee0a0076, 0x000c0000,
+	0x00000000, 0xbf8a0000,
+	0xd8340000, 0x00000001,
+	0xd5250001, 0x0001ff01,
+	0x00000080, 0x807dff7d,
+	0x00000080, 0x8078ff78,
+	0x00000080, 0xbf0a6f7d,
+	0xbfa2ffef, 0xbfa00011,
+	0x80767874, 0x82778075,
+	0xee0a0076, 0x000c0000,
+	0x00000000, 0xbf8a0000,
+	0xd8340000, 0x00000001,
+	0xd5250001, 0x0001ff01,
+	0x00000100, 0x807dff7d,
+	0x00000100, 0x8078ff78,
+	0x00000100, 0xbf0a6f7d,
+	0xbfa2ffef, 0xbef80080,
 	0xbefe00c1, 0x857d9972,
 	0x8b7d817d, 0xbf06817d,
 	0xbfa20002, 0xbeff0080,
 	0xbfa00001, 0xbeff00c1,
-	0xb8ef4306, 0x8b6fc16f,
-	0xbfa10039, 0x846f8a6f,
-	0xb8f83b05, 0x80788178,
-	0xbf0d9972, 0xbfa20002,
-	0x84788978, 0xbfa00001,
-	0x84788a78, 0x8078ff78,
-	0x00000200, 0x8078ff78,
-	0x00000200, 0x857d9972,
+	0xb8ef3b05, 0x806f816f,
+	0x846f826f, 0x857d9972,
 	0x8b7d817d, 0xbf06817d,
-	0xbefd0080, 0xd71f0001,
-	0x000100c1, 0xd7200001,
-	0x000202c1, 0x30020282,
-	0xbfa20012, 0x80767874,
+	0xbfa2002c, 0xbeee0078,
+	0x8078ff78, 0x00000200,
+	0xbefd0084, 0x80767874,
 	0x82778075, 0xee0a0076,
 	0x000c0000, 0x00000000,
-	0xbf8a0000, 0xd8340000,
-	0x00000001, 0xd5250001,
-	0x0001ff01, 0x00000080,
-	0x807dff7d, 0x00000080,
-	0x8078ff78, 0x00000080,
-	0xbf0a6f7d, 0xbfa2ffef,
-	0xbfa00011, 0x80767874,
+	0xee0a0076, 0x000c0001,
+	0x00008000, 0xee0a0076,
+	0x000c0002, 0x00010000,
+	0xee0a0076, 0x000c0003,
+	0x00018000, 0xbf8a0000,
+	0x7e008500, 0x7e028501,
+	0x7e048502, 0x7e068503,
+	0x807d847d, 0x8078ff78,
+	0x00000200, 0xbf0a6f7d,
+	0xbfa2ffe8, 0x80766e74,
 	0x82778075, 0xee0a0076,
 	0x000c0000, 0x00000000,
-	0xbf8a0000, 0xd8340000,
-	0x00000001, 0xd5250001,
-	0x0001ff01, 0x00000100,
-	0x807dff7d, 0x00000100,
-	0x8078ff78, 0x00000100,
-	0xbf0a6f7d, 0xbfa2ffef,
-	0xbef80080, 0xbefe00c1,
-	0x857d9972, 0x8b7d817d,
-	0xbf06817d, 0xbfa20002,
-	0xbeff0080, 0xbfa00001,
-	0xbeff00c1, 0xb8ef3b05,
-	0x806f816f, 0x846f826f,
-	0x857d9972, 0x8b7d817d,
-	0xbf06817d, 0xbfa2002c,
-	0xbeee0078, 0x8078ff78,
-	0x00000200, 0xbefd0084,
-	0x80767874, 0x82778075,
-	0xee0a0076, 0x000c0000,
-	0x00000000, 0xee0a0076,
-	0x000c0001, 0x00008000,
-	0xee0a0076, 0x000c0002,
+	0xee0a0076, 0x000c0001,
+	0x00008000, 0xee0a0076,
+	0x000c0002, 0x00010000,
+	0xee0a0076, 0x000c0003,
+	0x00018000, 0xbf8a0000,
+	0xbfa0002d, 0xbeee0078,
+	0x8078ff78, 0x00000400,
+	0xbefd0084, 0xbf0a6f7d,
+	0xbfa10018, 0x80767874,
+	0x82778075, 0xee0a0076,
+	0x000c0000, 0x00000000,
+	0xee0a0076, 0x000c0001,
 	0x00010000, 0xee0a0076,
-	0x000c0003, 0x00018000,
-	0xbf8a0000, 0x7e008500,
-	0x7e028501, 0x7e048502,
-	0x7e068503, 0x807d847d,
-	0x8078ff78, 0x00000200,
-	0xbf0a6f7d, 0xbfa2ffe8,
-	0x80766e74, 0x82778075,
-	0xee0a0076, 0x000c0000,
-	0x00000000, 0xee0a0076,
-	0x000c0001, 0x00008000,
-	0xee0a0076, 0x000c0002,
+	0x000c0002, 0x00020000,
+	0xee0a0076, 0x000c0003,
+	0x00030000, 0xbf8a0000,
+	0x7e008500, 0x7e028501,
+	0x7e048502, 0x7e068503,
+	0x807d847d, 0x8078ff78,
+	0x00000400, 0xbf0a6f7d,
+	0xbfa2ffe8, 0x80766e74,
+	0x82778075, 0xee0a0076,
+	0x000c0000, 0x00000000,
+	0xee0a0076, 0x000c0001,
 	0x00010000, 0xee0a0076,
-	0x000c0003, 0x00018000,
-	0xbf8a0000, 0xbfa0002d,
-	0xbeee0078, 0x8078ff78,
-	0x00000400, 0xbefd0084,
-	0xbf0a6f7d, 0xbfa10018,
-	0x80767874, 0x82778075,
-	0xee0a0076, 0x000c0000,
-	0x00000000, 0xee0a0076,
-	0x000c0001, 0x00010000,
-	0xee0a0076, 0x000c0002,
-	0x00020000, 0xee0a0076,
-	0x000c0003, 0x00030000,
-	0xbf8a0000, 0x7e008500,
-	0x7e028501, 0x7e048502,
-	0x7e068503, 0x807d847d,
-	0x8078ff78, 0x00000400,
-	0xbf0a6f7d, 0xbfa2ffe8,
-	0x80766e74, 0x82778075,
-	0xee0a0076, 0x000c0000,
-	0x00000000, 0xee0a0076,
-	0x000c0001, 0x00010000,
-	0xee0a0076, 0x000c0002,
-	0x00020000, 0xee0a0076,
-	0x000c0003, 0x00030000,
-	0xbf8a0000, 0xb8f83b05,
-	0x80788178, 0xbf0d9972,
-	0xbfa20002, 0x84788978,
-	0xbfa00001, 0x84788a78,
-	0x8078ff78, 0x00000200,
-	0x80f8ff78, 0x00000060,
-	0x80767874, 0x82778075,
-	0xbefd00ff, 0x0000006c,
-	0xf460403b, 0xf8000000,
-	0xbf8a0000, 0x80fd847d,
-	0xbf800000, 0xbe804300,
-	0xbe824302, 0x80f6a076,
-	0x82f78077, 0xf460603b,
+	0x000c0002, 0x00020000,
+	0xee0a0076, 0x000c0003,
+	0x00030000, 0xbf8a0000,
+	0xb8f83b05, 0x80788178,
+	0xbf0d9972, 0xbfa20002,
+	0x84788978, 0xbfa00001,
+	0x84788a78, 0x8078ff78,
+	0x00000200, 0x80f8ff78,
+	0x00000060, 0x80767874,
+	0x82778075, 0xbefd00ff,
+	0x0000006c, 0xf460403b,
 	0xf8000000, 0xbf8a0000,
-	0x80fd887d, 0xbf800000,
+	0x80fd847d, 0xbf800000,
 	0xbe804300, 0xbe824302,
-	0xbe844304, 0xbe864306,
-	0x80f6c076, 0x82f78077,
-	0xf460803b, 0xf8000000,
-	0xbf8a0000, 0x80fd907d,
+	0x80f6a076, 0x82f78077,
+	0xf460603b, 0xf8000000,
+	0xbf8a0000, 0x80fd887d,
 	0xbf800000, 0xbe804300,
 	0xbe824302, 0xbe844304,
-	0xbe864306, 0xbe884308,
-	0xbe8a430a, 0xbe8c430c,
-	0xbe8e430e, 0xbf06807d,
-	0xbfa1ffef, 0xb980f801,
-	0x00000000, 0xb8f83b05,
-	0x80788178, 0xbf0d9972,
-	0xbfa20002, 0x84788978,
-	0xbfa00001, 0x84788a78,
-	0x8078ff78, 0x00000200,
-	0x80767874, 0x82778075,
-	0xbeff0071, 0xf4601bfb,
-	0xf8000000, 0xf4601b3b,
-	0xf8000004, 0xf4601b7b,
-	0xf8000008, 0xf4601c3b,
-	0xf800000c, 0xf4601c7b,
-	0xf8000010, 0xf4601ebb,
-	0xf8000014, 0xf4601efb,
-	0xf8000018, 0xf4601e7b,
-	0xf800001c, 0xf4601cfb,
-	0xf8000020, 0xf4601bbb,
-	0xf8000024, 0xbf8a0000,
-	0xb96ef814, 0xf4601bbb,
-	0xf8000028, 0xbf8a0000,
-	0xb96ef815, 0xf4601bbb,
-	0xf800002c, 0xbf8a0000,
-	0xb96ef812, 0xf4601bbb,
-	0xf8000030, 0xbf8a0000,
-	0xb96ef813, 0x8b6eff7f,
-	0x04000000, 0xbfa10022,
-	0xf4601bbb, 0xf8000038,
-	0xbf8a0000, 0xbf0d806e,
-	0xbfa1001d, 0x856e906e,
-	0x8b6e6e6e, 0xbfa10003,
-	0xbe804ec1, 0x816ec16e,
-	0xbfa0fffb, 0xbef800ff,
-	0x00000080, 0xbefd0081,
-	0xf4601bbb, 0xf0000000,
-	0xbfc70000, 0x80788478,
-	0x937eff6e, 0x00070004,
-	0x847e907e, 0x8c7d7e7d,
-	0xbe80517d, 0x917dff7d,
-	0x007f0000, 0x856e906e,
-	0x8b6e6e6e, 0xbfa10003,
-	0xbe804e7d, 0x816ec16e,
-	0xbfa0fffb, 0x807d817d,
-	0xbf08907d, 0xbfa1ffec,
-	0xf4601bbb, 0xf800003c,
-	0xbfc70000, 0xbf0d806e,
-	0xbfa1000c, 0xbf0d9a7f,
-	0xbfa10002, 0xbf068180,
-	0xbe804fc4, 0xbf94fffc,
-	0xbfa10006, 0x856e906e,
-	0x8b6e6e6e, 0xbfa10003,
-	0xbe804ec3, 0x816ec16e,
-	0xbfa0fffb, 0xbefd006f,
-	0xbefe0070, 0xbeff0071,
-	0xb979f822, 0xb97b2011,
-	0x857b867b, 0xb97b0191,
-	0x857b827b, 0xb97bba11,
-	0xb973f801, 0xb8ee3b05,
-	0x806e816e, 0xbf0d9972,
-	0xbfa20002, 0x846e896e,
-	0xbfa00001, 0x846e8a6e,
-	0x806eff6e, 0x000001c0,
-	0x806e746e, 0x826f8075,
-	0xf4605c37, 0xf8000010,
-	0xf4605d37, 0xf8000020,
-	0xf4601e77, 0xf8000034,
-	0xbf8a0000, 0x856e9677,
-	0xb96e04a1, 0x856e9577,
-	0xb96e0421, 0x856e8e77,
-	0xb96e3021, 0x8b6dff6d,
-	0x01ffffff, 0x8bfe7e7e,
-	0x8bea6a6a, 0xb97af804,
+	0xbe864306, 0x80f6c076,
+	0x82f78077, 0xf460803b,
+	0xf8000000, 0xbf8a0000,
+	0x80fd907d, 0xbf800000,
+	0xbe804300, 0xbe824302,
+	0xbe844304, 0xbe864306,
+	0xbe884308, 0xbe8a430a,
+	0xbe8c430c, 0xbe8e430e,
+	0xbf06807d, 0xbfa1ffef,
+	0xb980f801, 0x00000000,
+	0xb8f83b05, 0x80788178,
+	0xbf0d9972, 0xbfa20002,
+	0x84788978, 0xbfa00001,
+	0x84788a78, 0x8078ff78,
+	0x00000200, 0x80767874,
+	0x82778075, 0xbeff0071,
+	0xf4601bfb, 0xf8000000,
+	0xf4601b3b, 0xf8000004,
+	0xf4601b7b, 0xf8000008,
+	0xf4601c3b, 0xf800000c,
+	0xf4601c7b, 0xf8000010,
+	0xf4601ebb, 0xf8000014,
+	0xf4601efb, 0xf8000018,
+	0xf4601e7b, 0xf800001c,
+	0xf4601cfb, 0xf8000020,
+	0xf4601bbb, 0xf8000024,
+	0xbf8a0000, 0xb96ef814,
+	0xf4601bbb, 0xf8000028,
+	0xbf8a0000, 0xb96ef815,
+	0xf4601bbb, 0xf800002c,
+	0xbf8a0000, 0xb96ef812,
+	0xf4601bbb, 0xf8000030,
+	0xbf8a0000, 0xb96ef813,
+	0x8b6eff7f, 0x04000000,
+	0xbfa10022, 0xf4601bbb,
+	0xf8000038, 0xbf8a0000,
+	0xbf0d806e, 0xbfa1001d,
+	0x856e906e, 0x8b6e6e6e,
+	0xbfa10003, 0xbe804ec1,
+	0x816ec16e, 0xbfa0fffb,
+	0xbef800ff, 0x00000080,
+	0xbefd0081, 0xf4601bbb,
+	0xf0000000, 0xbfc70000,
+	0x80788478, 0x937eff6e,
+	0x00070004, 0x847e907e,
+	0x8c7d7e7d, 0xbe80517d,
+	0x917dff7d, 0x007f0000,
+	0x856e906e, 0x8b6e6e6e,
+	0xbfa10003, 0xbe804e7d,
+	0x816ec16e, 0xbfa0fffb,
+	0x807d817d, 0xbf08907d,
+	0xbfa1ffec, 0xf4601bbb,
+	0xf800003c, 0xbfc70000,
+	0xbf0d806e, 0xbfa1000c,
+	0xbf0d9a7f, 0xbfa10002,
+	0xbf068180, 0xbe804fc4,
+	0xbf94fffc, 0xbfa10006,
+	0x856e906e, 0x8b6e6e6e,
+	0xbfa10003, 0xbe804ec3,
+	0x816ec16e, 0xbfa0fffb,
+	0xbefd006f, 0xbefe0070,
+	0xbeff0071, 0xb979f822,
+	0xb97b2011, 0x857b867b,
+	0xb97b0191, 0x857b827b,
+	0xb97bba11, 0xb973f801,
+	0xb8ee3b05, 0x806e816e,
+	0xbf0d9972, 0xbfa20002,
+	0x846e896e, 0xbfa00001,
+	0x846e8a6e, 0x806eff6e,
+	0x000001c0, 0x806e746e,
+	0x826f8075, 0xf4605c37,
+	0xf8000010, 0xf4605d37,
+	0xf8000020, 0xf4601e77,
+	0xf8000034, 0xbf8a0000,
+	0x856e9677, 0xb96e04a1,
+	0x856e9577, 0xb96e0421,
+	0x856e8e77, 0xb96e3021,
+	0x8b6dff6d, 0x01ffffff,
+	0x8bfe7e7e, 0x8bea6a6a,
+	0x936eff77, 0x0002001a,
+	0xb96ef81a, 0xb97af804,
 	0xb8eef802, 0xbf0c8b6e,
 	0xbfa20003, 0xbe804fc2,
 	0xbf94fffe, 0xbfa10001,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index b7b82f1c6072..1624a02ad0ef 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -1329,6 +1329,7 @@ end
 function restore_sched_mode(s_tmp)
 	s_bfe_u32	s_tmp, ttmp11, (TTMP11_SCHED_MODE_SHIFT | (TTMP11_SCHED_MODE_SIZE << 0x10))
 	s_setreg_b32	hwreg(HW_REG_WAVE_SCHED_MODE), s_tmp
+end
 
 function restore_barrier_signal_count(barrier_id)
 	// extract the saved signal count from s_restore_tmp
-- 
2.34.1



* [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
  2026-01-16 20:39 [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support Jay Cornwall
  2026-01-16 20:39 ` [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source Jay Cornwall
@ 2026-01-16 20:39 ` Jay Cornwall
  2026-01-20 22:38   ` Lancelot SIX
  2026-01-16 20:39 ` [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround Jay Cornwall
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Jay Cornwall, Lancelot Six, Joseph Greathouse, Vladimir Indic

Scalar loads may arrive out-of-order with respect to KMCNT.
The affected code expects the two loads to arrive in-order.
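
To illustrate the hazard, here is a toy model rather than a description
of the hardware. It assumes KMCNT only counts outstanding scalar loads
and drops by one whenever any load returns, in any order. The names
use_is_safe, OLD, NEW, dw0_1 and dw2_3 are hypothetical and exist only
for this sketch.

```python
from itertools import permutations

def use_is_safe(program):
    """Return True if every 'use' sees returned data under every return order.

    Toy model (assumption): a wait only guarantees that the number of
    outstanding loads is <= its argument, not which loads have returned.
    """
    loads = [arg for op, arg in program if op == "load"]
    for order in permutations(loads):
        issued, returned = [], set()
        for op, arg in program:
            if op == "load":
                issued.append(arg)
            elif op == "wait":
                # Worst case: only as many loads return as the wait requires,
                # and they return in the order being tried.
                pending = [l for l in order if l in issued and l not in returned]
                while len(pending) > arg:
                    returned.add(pending.pop(0))
            elif op == "use" and arg not in returned:
                return False
    return True

# Old sequence: both loads issued, then s_wait_kmcnt 1, then the first is used.
OLD = [("load", "dw0_1"), ("load", "dw2_3"), ("wait", 1), ("use", "dw0_1")]
# New sequence: s_wait_kmcnt 0 sits between the two loads.
NEW = [("load", "dw0_1"), ("wait", 0), ("load", "dw2_3"), ("use", "dw0_1")]

print(use_is_safe(OLD))  # False
print(use_is_safe(NEW))  # True
```

Under this model the old sequence fails when the second load returns
first, because the count reaches 1 while the first load is still
outstanding.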

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Lancelot Six <lancelot.six@amd.com>
Cc: Joseph Greathouse <joseph.greathouse@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h         | 8 ++++----
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 6281b2f9faee..453c08845d74 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -4638,8 +4638,8 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0x01ffffff, 0xb8fbf811,
 	0xbf0d847b, 0xbfa20078,
 	0xf4003eb6, 0xf8000000,
-	0xf4003bb6, 0xf8000008,
-	0xbfc70001, 0x8b76ff7a,
+	0xbfc70000, 0xf4003bb6,
+	0xf8000008, 0x8b76ff7a,
 	0x80000000, 0xbfa20027,
 	0x9376ff7a, 0x00060019,
 	0x81f9a376, 0xbf0b8179,
@@ -4717,8 +4717,8 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0xb980f821, 0x00000000,
 	0xbf0d847b, 0xbfa20078,
 	0xf4003eb6, 0xf8000000,
-	0xf4003bb6, 0xf8000008,
-	0xbfc70001, 0x8b76ff7a,
+	0xbfc70000, 0xf4003bb6,
+	0xf8000008, 0x8b76ff7a,
 	0x80000000, 0xbfa20027,
 	0x9376ff7a, 0x00060019,
 	0x81f9a376, 0xbf0b8179,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index 1624a02ad0ef..7ed4b502eb22 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -1357,8 +1357,8 @@ function fixup_vgpr_bank_selection
 	// ttmp[0:1]: {7b'0} PC[56:0]
 	// ttmp2, 3, 10, 13, 14, 15: free
 	s_load_b64	[ttmp14, ttmp15], [ttmp0, ttmp1], 0 scope:SCOPE_CU	// Load the 2 instruction DW we are returning to
+	s_wait_kmcnt	0
 	s_load_b64	[ttmp2, ttmp3], [ttmp0, ttmp1], 8 scope:SCOPE_CU	// Load the next 2 instruction DW, just in case
-	s_wait_kmcnt	1
 	s_and_b32	ttmp10, ttmp14, 0x80000000				// Check bit 31 in the first DWORD
 										// SCC set if ttmp10 is != 0, i.e. if bit 31 == 1
 	s_cbranch_scc1	L_FIXUP_NOT_VOP12C					// If bit 31 is 1, we are not VOP1, VOP2, or VOP3C
-- 
2.34.1



* [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround
  2026-01-16 20:39 [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support Jay Cornwall
  2026-01-16 20:39 ` [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source Jay Cornwall
  2026-01-16 20:39 ` [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler Jay Cornwall
@ 2026-01-16 20:39 ` Jay Cornwall
  2026-01-20 23:27   ` Lancelot SIX
  2026-01-16 20:39 ` [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode Jay Cornwall
  2026-01-16 20:39 ` [PATCH 5/5] drm/amdkfd: Do not include VGPR MSBs in saved PC during save Jay Cornwall
  4 siblings, 1 reply; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC (permalink / raw)
  To: amd-gfx
  Cc: Jay Cornwall, Gang Ba, Harish Kasiviswanathan, Lancelot Six,
	Vladimir Indic

Trap cluster barrier may not serialize with user cluster barrier
under some circumstances. Add a check for pending user cluster
barrier complete.
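
A rough software sketch of the added check, using the BARRIER_STATE_*
constants from the .asm in this patch. The helpers bfe_operand, bfe_u32
and must_keep_waiting are hypothetical names, and bfe_u32 is only an
assumed model of an unsigned bitfield extract (offset in the low bits of
the operand, width starting at bit 16).

```python
# Field layout as defined in cwsr_trap_handler_gfx12.asm in this patch.
BARRIER_STATE_VALID_OFFSET = 0
BARRIER_STATE_MEMBER_OFFSET = 4
BARRIER_STATE_MEMBER_SIZE = 7
BARRIER_STATE_SIGNAL_OFFSET = 16
BARRIER_STATE_SIGNAL_SIZE = 7

def bfe_operand(offset, size):
    """Pack offset and width the way the handler writes the s_bfe_u32 operand."""
    return offset | (size << 0x10)

def bfe_u32(value, operand):
    """Assumed model of an unsigned bitfield extract driven by that operand."""
    offset = operand & 0x1F
    size = (operand >> 16) & 0x7F
    return (value >> offset) & ((1 << size) - 1)

def must_keep_waiting(state):
    """Mirror the loop condition: poll again while member == signal count."""
    if not (state >> BARRIER_STATE_VALID_OFFSET) & 1:
        return False  # valid bit clear: the wave is not in a cluster
    member = bfe_u32(state, bfe_operand(BARRIER_STATE_MEMBER_OFFSET,
                                        BARRIER_STATE_MEMBER_SIZE))
    signal = bfe_u32(state, bfe_operand(BARRIER_STATE_SIGNAL_OFFSET,
                                        BARRIER_STATE_SIGNAL_SIZE))
    return member == signal

# Valid cluster, 3 members, 3 signals: user barrier completion is in flight.
print(must_keep_waiting(1 | (3 << 4) | (3 << 16)))  # True
# Valid cluster, 3 members, 1 signal: nothing pending, stop polling.
print(must_keep_waiting(1 | (3 << 4) | (1 << 16)))  # False
```

The packed operands come out as 0x00070004 and 0x00070010, which are the
literals that appear in the regenerated gfx12.1 binary below.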

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Tested-by: Gang Ba <Gang.Ba@amd.com>
Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Cc: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h    | 31 +++++++++-------
 .../amd/amdkfd/cwsr_trap_handler_gfx12.asm    | 36 +++++++++++++++----
 2 files changed, 47 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 453c08845d74..d86bccc49e3f 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -3754,11 +3754,11 @@ static const uint32_t cwsr_trap_gfx12_hex[] = {
 	0x84708a70, 0x8070ff70,
 	0x00000200, 0x7e000280,
 	0x7e020280, 0x7e040280,
-	0xbefd0080, 0xbe804ec2,
-	0xbf94fffe, 0xb8faf804,
-	0x8b7a847a, 0x91788478,
-	0x8c787a78, 0xd7610002,
+	0xbefd0080, 0xd7610002,
 	0x0000fa71, 0x807d817d,
+	0xbe804ec2, 0xbf94fffe,
+	0xb8faf804, 0x8b7a847a,
+	0x91788478, 0x8c787a78,
 	0xd7610002, 0x0000fa6c,
 	0x807d817d, 0x917aff6d,
 	0x80000000, 0xd7610002,
@@ -4587,7 +4587,7 @@ static const uint32_t cwsr_trap_gfx9_5_0_hex[] = {
 };
 
 static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
-	0xbfa00001, 0xbfa003aa,
+	0xbfa00001, 0xbfa003b4,
 	0xb0804009, 0xb8eef81a,
 	0xbf880000, 0xb980081a,
 	0x00000000, 0xb8f8f804,
@@ -4838,15 +4838,20 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0x84708a70, 0x8070ff70,
 	0x00000200, 0x7e000280,
 	0x7e020280, 0x7e040280,
-	0xbefd0080, 0xb8faf802,
-	0xbf0c8b7a, 0xbfa20003,
-	0xbe804fc2, 0xbf94fffe,
-	0xbfa10001, 0xbe804ec4,
-	0xbf94fffc, 0xb8faf804,
-	0x8b7aff7a, 0x0001000c,
-	0x9178ff78, 0x0001000c,
-	0x8c787a78, 0xd7610002,
+	0xbefd0080, 0xd7610002,
 	0x0000fa71, 0x807d817d,
+	0xb8faf802, 0xbf0c8b7a,
+	0xbfa20003, 0xbe804fc2,
+	0xbf94fffe, 0xbfa10001,
+	0xbe804ec4, 0xbf94fffc,
+	0xbefa4c88, 0xbfc70000,
+	0xbf0c807a, 0xbfa20006,
+	0x9371ff7a, 0x00070004,
+	0x937aff7a, 0x00070010,
+	0xbf06717a, 0xbfa2fff6,
+	0xb8faf804, 0x8b7aff7a,
+	0x0001000c, 0x9178ff78,
+	0x0001000c, 0x8c787a78,
 	0xd7610002, 0x0000fa6c,
 	0x807d817d, 0x917aff6d,
 	0x80000000, 0xd7610002,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index 7ed4b502eb22..ace2a9f2ac73 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -35,6 +35,7 @@
 #define HAVE_BANKED_VGPRS (ASIC_FAMILY == CHIP_GC_12_0_3)
 #define NUM_NAMED_BARRIERS (ASIC_FAMILY == CHIP_GC_12_0_3 ? 0x10 : 0)
 #define HAVE_CLUSTER_BARRIER (ASIC_FAMILY == CHIP_GC_12_0_3)
+#define CLUSTER_BARRIER_SERIALIZE_WORKAROUND (ASIC_FAMILY == CHIP_GC_12_0_3)
 
 #define SINGLE_STEP_MISSED_WORKAROUND 1	//workaround for lost TRAP_AFTER_INST exception when SAVECTX raised
 #define HAVE_VALU_SGPR_HAZARD (ASIC_FAMILY == CHIP_GFX12)
@@ -104,6 +105,7 @@ var SQ_WAVE_SCHED_MODE_DEP_MODE_SHIFT		= 0
 var SQ_WAVE_SCHED_MODE_DEP_MODE_SIZE		= 2
 
 var BARRIER_STATE_SIGNAL_OFFSET			= 16
+var BARRIER_STATE_SIGNAL_SIZE			= 7
 var BARRIER_STATE_MEMBER_OFFSET			= 4
 var BARRIER_STATE_MEMBER_SIZE			= 7
 var BARRIER_STATE_VALID_OFFSET			= 0
@@ -520,9 +522,11 @@ L_SAVE_HWREG:
 	v_mov_b32	v2, 0x0							//Set of SGPRs for TCP store
 	s_mov_b32	m0, 0x0							//Next lane of v2 to write to
 
+	write_hwreg_to_v2(s_save_m0)
+
 	// Ensure no further changes to barrier or LDS state.
 	// STATE_PRIV.*BARRIER_COMPLETE may change up to this point.
-	wait_trap_barriers(s_save_tmp)
+	wait_trap_barriers(s_save_tmp, s_save_m0, 1)
 
 	// Re-read final state of *BARRIER_COMPLETE fields for save.
 	s_getreg_b32	s_save_tmp, hwreg(HW_REG_WAVE_STATE_PRIV)
@@ -530,7 +534,6 @@ L_SAVE_HWREG:
 	s_andn2_b32	s_save_state_priv, s_save_state_priv, SQ_WAVE_STATE_PRIV_ALL_BARRIER_COMPLETE_MASK
 	s_or_b32	s_save_state_priv, s_save_state_priv, s_save_tmp
 
-	write_hwreg_to_v2(s_save_m0)
 	write_hwreg_to_v2(s_save_pc_lo)
 	s_andn2_b32	s_save_tmp, s_save_pc_hi, S_SAVE_PC_HI_FIRST_WAVE_MASK
 	write_hwreg_to_v2(s_save_tmp)
@@ -1198,7 +1201,7 @@ L_SKIP_CLUSTER_BARRIER_RESTORE:
 
 	// Make barrier and LDS state visible to all waves in the group/cluster.
 	// STATE_PRIV.*BARRIER_COMPLETE may change after this point.
-	wait_trap_barriers(s_restore_tmp)
+	wait_trap_barriers(s_restore_tmp, 0, 0)
 
 #if HAVE_CLUSTER_BARRIER
 	// SCC is changed by wait_trap_barriers, restore it separately.
@@ -1211,7 +1214,7 @@ L_SKIP_CLUSTER_BARRIER_RESTORE:
 L_END_PGM:
 	// Make sure that no wave of the group/cluster can exit the trap handler
 	// before the group/cluster barrier state is saved.
-	wait_trap_barriers(s_restore_tmp)
+	wait_trap_barriers(s_restore_tmp, 0, 0)
 
 	s_endpgm_saved
 end
@@ -1301,11 +1304,11 @@ function restore_xnack_state_priv(s_tmp)
 end
 #endif
 
-function wait_trap_barriers(s_tmp)
+function wait_trap_barriers(s_tmp1, s_tmp2, serialize_wa)
 #if HAVE_CLUSTER_BARRIER
 	// If not in a WG then wave cannot use s_barrier_signal_isfirst.
-	s_getreg_b32	s_tmp, hwreg(HW_REG_WAVE_STATUS)
-	s_bitcmp0_b32	s_tmp, SQ_WAVE_STATUS_IN_WG_SHIFT
+	s_getreg_b32	s_tmp1, hwreg(HW_REG_WAVE_STATUS)
+	s_bitcmp0_b32	s_tmp1, SQ_WAVE_STATUS_IN_WG_SHIFT
 	s_cbranch_scc1	L_TRAP_CLUSTER_BARRIER_SIGNAL
 
 	s_barrier_signal_isfirst	-2
@@ -1319,6 +1322,25 @@ L_TRAP_CLUSTER_BARRIER_SIGNAL:
 
 L_SKIP_TRAP_CLUSTER_BARRIER_SIGNAL:
 	s_barrier_wait	-4
+
+#if CLUSTER_BARRIER_SERIALIZE_WORKAROUND
+if serialize_wa
+	// Trap cluster barrier may complete with a user cluster barrier in-flight.
+	// This is indicated if user cluster member count and signal count are equal.
+L_WAIT_USER_CLUSTER_BARRIER_COMPLETE:
+	s_sendmsg_rtn_b32	s_tmp1, sendmsg(MSG_RTN_GET_CLUSTER_BARRIER_STATE)
+	s_wait_kmcnt	0
+	s_bitcmp0_b32	s_tmp1, BARRIER_STATE_VALID_OFFSET
+	s_cbranch_scc1	L_NOT_IN_CLUSTER
+
+	s_bfe_u32	s_tmp2, s_tmp1, (BARRIER_STATE_MEMBER_OFFSET | (BARRIER_STATE_MEMBER_SIZE << 0x10))
+	s_bfe_u32	s_tmp1, s_tmp1, (BARRIER_STATE_SIGNAL_OFFSET | (BARRIER_STATE_SIGNAL_SIZE << 0x10))
+	s_cmp_eq_u32	s_tmp1, s_tmp2
+	s_cbranch_scc1	L_WAIT_USER_CLUSTER_BARRIER_COMPLETE
+end
+L_NOT_IN_CLUSTER:
+#endif
+
 #else
 	s_barrier_signal	-2
 	s_barrier_wait	-2
-- 
2.34.1



* [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
  2026-01-16 20:39 [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support Jay Cornwall
                   ` (2 preceding siblings ...)
  2026-01-16 20:39 ` [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround Jay Cornwall
@ 2026-01-16 20:39 ` Jay Cornwall
  2026-01-20 23:30   ` Lancelot SIX
  2026-01-16 20:39 ` [PATCH 5/5] drm/amdkfd: Do not include VGPR MSBs in saved PC during save Jay Cornwall
  4 siblings, 1 reply; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Jay Cornwall, Lancelot Six, Vladimir Indic

- Leave DEP_MODE unchanged as it is ignored in the trap handler
- Save/restore SCHED_MODE (gfx12.0 saves in ttmp11)
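
For reference, a small sketch of packing a scheduling-mode field into a
spare ttmp11 bitfield and reading it back. The shift (26) and size (2)
are assumptions read off the literals 0x0c000000 and 0x0002001a in the
binary hunks of this series, not values taken from the .asm definitions.
The function names are hypothetical.

```python
# Assumed field position, inferred from literals in the hex array.
TTMP11_SCHED_MODE_SHIFT = 26
TTMP11_SCHED_MODE_SIZE = 2
TTMP11_SCHED_MODE_MASK = ((1 << TTMP11_SCHED_MODE_SIZE) - 1) << TTMP11_SCHED_MODE_SHIFT

def save_sched_mode(ttmp11, sched_mode):
    """Clear the field in ttmp11, then OR in the current scheduling mode."""
    cleared = ttmp11 & ~TTMP11_SCHED_MODE_MASK & 0xFFFFFFFF
    return cleared | ((sched_mode << TTMP11_SCHED_MODE_SHIFT) & TTMP11_SCHED_MODE_MASK)

def restore_sched_mode(ttmp11):
    """Extract the saved scheduling mode, as a bitfield extract would."""
    return (ttmp11 >> TTMP11_SCHED_MODE_SHIFT) & ((1 << TTMP11_SCHED_MODE_SIZE) - 1)

saved = save_sched_mode(0x12345678, 2)
print(hex(TTMP11_SCHED_MODE_MASK))  # 0xc000000
print(restore_sched_mode(saved))    # 2
```

Other bits of the register are left untouched by the save, so unrelated
state kept in the same register survives the round trip.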

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h    | 372 +++++++++---------
 .../amd/amdkfd/cwsr_trap_handler_gfx12.asm    |  32 +-
 2 files changed, 214 insertions(+), 190 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index d86bccc49e3f..9bb7fb6a83ed 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -4587,18 +4587,14 @@ static const uint32_t cwsr_trap_gfx9_5_0_hex[] = {
 };
 
 static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
-	0xbfa00001, 0xbfa003b4,
-	0xb0804009, 0xb8eef81a,
-	0xbf880000, 0xb980081a,
-	0x00000000, 0xb8f8f804,
-	0x9177ff77, 0x0c000000,
-	0x846e9a6e, 0x8c776e77,
+	0xbfa00001, 0xbfa003ac,
+	0xb0804009, 0xb8f8f804,
 	0x9178ff78, 0x00008c00,
 	0xb8fbf811, 0x8b6eff78,
 	0x00004000, 0xbfa10008,
 	0x8b6eff7b, 0x00000080,
 	0xbfa20018, 0x8b6ea07b,
-	0xbfa200d4, 0xbf830010,
+	0xbfa200d1, 0xbf830010,
 	0xb8fbf811, 0xbfa0fffb,
 	0x8b6eff7b, 0x00000bd0,
 	0xbfa20010, 0xb8eef812,
@@ -4609,7 +4605,7 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0xf0000000, 0xbfa20005,
 	0x8b6fff6f, 0x00000200,
 	0xbfa20002, 0x8b6ea07b,
-	0xbfa200be, 0x9177ff77,
+	0xbfa200bb, 0x9177ff77,
 	0x007fc000, 0xb8fa04a1,
 	0x847a967a, 0x8c777a77,
 	0xb8fa0421, 0x847a957a,
@@ -4702,189 +4698,189 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0xb97a0421, 0x857a8e77,
 	0xb97a3021, 0x8bfe7e7e,
 	0x8bea6a6a, 0x85788978,
-	0x936eff77, 0x0002001a,
-	0xb96ef81a, 0xb9783244,
-	0xbe804a6c, 0xb8faf802,
-	0xbf0d987a, 0xbfa10001,
-	0xbfb00000, 0x8b6dff6d,
-	0x01ffffff, 0xbefa0080,
-	0xb97a0151, 0x9177ff77,
-	0x007fc000, 0xb8fa04a1,
-	0x847a967a, 0x8c777a77,
-	0xb8fa0421, 0x847a957a,
-	0x8c777a77, 0xb8fa3021,
-	0x847a8e7a, 0x8c777a77,
-	0xb980f821, 0x00000000,
-	0xbf0d847b, 0xbfa20078,
-	0xf4003eb6, 0xf8000000,
-	0xbfc70000, 0xf4003bb6,
-	0xf8000008, 0x8b76ff7a,
-	0x80000000, 0xbfa20027,
-	0x9376ff7a, 0x00060019,
-	0x81f9a376, 0xbf0b8179,
-	0xbfa20068, 0x81f9ac76,
-	0xbf0b8179, 0xbfa20062,
-	0x81f9b776, 0xbf0b8179,
-	0xbfa2005f, 0x8b76ff7a,
-	0x000001ff, 0xbf06ff76,
-	0x000000fe, 0xbfa2005d,
-	0xbf06ff76, 0x000000ff,
-	0xbfa20057, 0xbf06ff76,
-	0x000000fa, 0xbfa20054,
-	0x81f9ff76, 0x000000e9,
-	0xbf0b8179, 0xbfa20050,
-	0x8b76ff7b, 0xffff0000,
-	0xbf06ff76, 0xbf860000,
-	0xbfa10051, 0x9376ff7b,
-	0x0002000e, 0x8b79ff7b,
-	0x00003f00, 0x85798679,
-	0x8c767976, 0xb9763b01,
-	0xbfa00049, 0x8b76ff7a,
-	0xfc000000, 0xbf06ff76,
-	0xd4000000, 0xbfa20013,
-	0xbf06ff76, 0xc8000000,
-	0xbfa20027, 0x8b76ff7a,
-	0xff000000, 0xbf06ff76,
-	0xcf000000, 0xbfa20039,
-	0x8b79ff7a, 0xffff0000,
-	0xbf06ff79, 0xcc350000,
-	0xbfa20037, 0xbf06ff79,
-	0xcc3a0000, 0xbfa20034,
-	0xbf06ff76, 0xcc000000,
-	0xbfa10031, 0x8b76ff7b,
-	0x000001ff, 0xbf06ff76,
-	0x000000ff, 0xbfa20029,
-	0xbf06ff76, 0x000000fa,
-	0xbfa20026, 0x81f6ff76,
-	0x000000e9, 0xbf0b8176,
-	0xbfa20022, 0x8b76ff7b,
-	0x0003fe00, 0xbf06ff76,
-	0x0001fe00, 0xbfa2001d,
-	0x8b76ff7b, 0x07fc0000,
-	0xbf06ff76, 0x03fc0000,
-	0xbfa20018, 0xbfa00014,
-	0x9376ff7a, 0x00040016,
-	0x81f68176, 0xbf0b8176,
-	0xbfa20012, 0x9376ff7a,
-	0x00050011, 0x81f68176,
-	0xbf0b8176, 0xbfa2000d,
+	0xb9783244, 0xbe804a6c,
+	0xb8faf802, 0xbf0d987a,
+	0xbfa10001, 0xbfb00000,
+	0x8b6dff6d, 0x01ffffff,
+	0xbefa0080, 0xb97a0151,
+	0x9177ff77, 0x007fc000,
+	0xb8fa04a1, 0x847a967a,
+	0x8c777a77, 0xb8fa0421,
+	0x847a957a, 0x8c777a77,
+	0xb8fa3021, 0x847a8e7a,
+	0x8c777a77, 0xb980f821,
+	0x00000000, 0xbf0d847b,
+	0xbfa20078, 0xf4003eb6,
+	0xf8000000, 0xbfc70000,
+	0xf4003bb6, 0xf8000008,
+	0x8b76ff7a, 0x80000000,
+	0xbfa20027, 0x9376ff7a,
+	0x00060019, 0x81f9a376,
+	0xbf0b8179, 0xbfa20068,
+	0x81f9ac76, 0xbf0b8179,
+	0xbfa20062, 0x81f9b776,
+	0xbf0b8179, 0xbfa2005f,
 	0x8b76ff7a, 0x000001ff,
+	0xbf06ff76, 0x000000fe,
+	0xbfa2005d, 0xbf06ff76,
+	0x000000ff, 0xbfa20057,
+	0xbf06ff76, 0x000000fa,
+	0xbfa20054, 0x81f9ff76,
+	0x000000e9, 0xbf0b8179,
+	0xbfa20050, 0x8b76ff7b,
+	0xffff0000, 0xbf06ff76,
+	0xbf860000, 0xbfa10051,
+	0x9376ff7b, 0x0002000e,
+	0x8b79ff7b, 0x00003f00,
+	0x85798679, 0x8c767976,
+	0xb9763b01, 0xbfa00049,
+	0x8b76ff7a, 0xfc000000,
+	0xbf06ff76, 0xd4000000,
+	0xbfa20013, 0xbf06ff76,
+	0xc8000000, 0xbfa20027,
+	0x8b76ff7a, 0xff000000,
+	0xbf06ff76, 0xcf000000,
+	0xbfa20039, 0x8b79ff7a,
+	0xffff0000, 0xbf06ff79,
+	0xcc350000, 0xbfa20037,
+	0xbf06ff79, 0xcc3a0000,
+	0xbfa20034, 0xbf06ff76,
+	0xcc000000, 0xbfa10031,
+	0x8b76ff7b, 0x000001ff,
 	0xbf06ff76, 0x000000ff,
-	0xbfa20008, 0x8b76ff7b,
+	0xbfa20029, 0xbf06ff76,
+	0x000000fa, 0xbfa20026,
+	0x81f6ff76, 0x000000e9,
+	0xbf0b8176, 0xbfa20022,
+	0x8b76ff7b, 0x0003fe00,
+	0xbf06ff76, 0x0001fe00,
+	0xbfa2001d, 0x8b76ff7b,
+	0x07fc0000, 0xbf06ff76,
+	0x03fc0000, 0xbfa20018,
+	0xbfa00014, 0x9376ff7a,
+	0x00040016, 0x81f68176,
+	0xbf0b8176, 0xbfa20012,
+	0x9376ff7a, 0x00050011,
+	0x81f68176, 0xbf0b8176,
+	0xbfa2000d, 0x8b76ff7a,
 	0x000001ff, 0xbf06ff76,
-	0x000000ff, 0xbfa20003,
-	0xbfc70000, 0xbefb006e,
-	0xbfa0ffad, 0xbfc70000,
-	0xbefb006f, 0xbfa0ffaa,
-	0xbfc70000, 0xbeee007e,
-	0xbeef007f, 0xbefe0180,
-	0xbefe4d84, 0xbf8a0000,
-	0x8b7aff7f, 0x04000000,
-	0x847a857a, 0x8c6d7a6d,
-	0xb8eff822, 0xb980f822,
-	0x00000000, 0xb8fa2b01,
-	0x847a997a, 0x8c6d7a6d,
-	0xbefa0080, 0xb97a2b01,
-	0xbefa007e, 0x8b7bff7f,
-	0x01ffffff, 0xbefe00c1,
-	0xbeff00c1, 0xee0a407a,
-	0x000c0000, 0x00000000,
-	0x7e000280, 0xbefe007a,
-	0xbeff007b, 0xb8fb0742,
-	0x847b997b, 0xb8fa3b05,
-	0x807a817a, 0xbf0d997b,
-	0xbfa20002, 0x847a897a,
-	0xbfa00001, 0x847a8a7a,
+	0x000000ff, 0xbfa20008,
+	0x8b76ff7b, 0x000001ff,
+	0xbf06ff76, 0x000000ff,
+	0xbfa20003, 0xbfc70000,
+	0xbefb006e, 0xbfa0ffad,
+	0xbfc70000, 0xbefb006f,
+	0xbfa0ffaa, 0xbfc70000,
+	0xbeee007e, 0xbeef007f,
+	0xbefe0180, 0xbefe4d84,
+	0xbf8a0000, 0x8b7aff7f,
+	0x04000000, 0x847a857a,
+	0x8c6d7a6d, 0xb8eff822,
+	0xb980f822, 0x00000000,
+	0xb8fa2b01, 0x847a997a,
+	0x8c6d7a6d, 0xbefa0080,
+	0xb97a2b01, 0xbefa007e,
 	0x8b7bff7f, 0x01ffffff,
-	0x807aff7a, 0x000001c0,
-	0x807a7e7a, 0x827b807b,
-	0xd7610000, 0x00010870,
-	0xd7610000, 0x00010a71,
-	0xd7610000, 0x00010c72,
-	0xd7610000, 0x00010e73,
-	0xd7610000, 0x00011074,
-	0xd7610000, 0x00011275,
-	0xd7610000, 0x00011476,
-	0xd7610000, 0x00011677,
-	0xd7610000, 0x00011a79,
-	0xd7610000, 0x00011c7e,
-	0xd7610000, 0x00011e7f,
-	0xbefe00ff, 0x00003fff,
-	0xbeff0080, 0xee0a407a,
-	0x000c0000, 0x00000000,
-	0xd760007a, 0x00011d00,
-	0xd760007b, 0x00011f00,
+	0xbefe00c1, 0xbeff00c1,
+	0xee0a407a, 0x000c0000,
+	0x00000000, 0x7e000280,
 	0xbefe007a, 0xbeff007b,
-	0xbef4007e, 0x8b75ff7f,
-	0x01ffffff, 0xbef1007d,
-	0xb8f30742, 0x84739973,
-	0xbefe00c1, 0x857d9973,
-	0x8b7d817d, 0xbf06817d,
-	0xbfa20002, 0xbeff0080,
-	0xbfa00002, 0xbeff00c1,
-	0xbfa0000a, 0xee0a4074,
-	0x008c0000, 0x00008000,
-	0xee0a4074, 0x010c0000,
+	0xb8fb0742, 0x847b997b,
+	0xb8fa3b05, 0x807a817a,
+	0xbf0d997b, 0xbfa20002,
+	0x847a897a, 0xbfa00001,
+	0x847a8a7a, 0x8b7bff7f,
+	0x01ffffff, 0x807aff7a,
+	0x000001c0, 0x807a7e7a,
+	0x827b807b, 0xd7610000,
+	0x00010870, 0xd7610000,
+	0x00010a71, 0xd7610000,
+	0x00010c72, 0xd7610000,
+	0x00010e73, 0xd7610000,
+	0x00011074, 0xd7610000,
+	0x00011275, 0xd7610000,
+	0x00011476, 0xd7610000,
+	0x00011677, 0xd7610000,
+	0x00011a79, 0xd7610000,
+	0x00011c7e, 0xd7610000,
+	0x00011e7f, 0xbefe00ff,
+	0x00003fff, 0xbeff0080,
+	0xee0a407a, 0x000c0000,
+	0x00000000, 0xd760007a,
+	0x00011d00, 0xd760007b,
+	0x00011f00, 0xbefe007a,
+	0xbeff007b, 0xbef4007e,
+	0x8b75ff7f, 0x01ffffff,
+	0xbef1007d, 0xb8f30742,
+	0x84739973, 0xbefe00c1,
+	0x857d9973, 0x8b7d817d,
+	0xbf06817d, 0xbfa20002,
+	0xbeff0080, 0xbfa00002,
+	0xbeff00c1, 0xbfa0000a,
+	0xee0a4074, 0x008c0000,
+	0x00008000, 0xee0a4074,
+	0x010c0000, 0x00010000,
+	0xee0a4074, 0x018c0000,
+	0x00018000, 0xbfa00009,
+	0xee0a4074, 0x008c0000,
 	0x00010000, 0xee0a4074,
-	0x018c0000, 0x00018000,
-	0xbfa00009, 0xee0a4074,
-	0x008c0000, 0x00010000,
-	0xee0a4074, 0x010c0000,
-	0x00020000, 0xee0a4074,
-	0x018c0000, 0x00030000,
-	0xb8f03b05, 0x80708170,
-	0xbf0d9973, 0xbfa20002,
-	0x84708970, 0xbfa00001,
-	0x84708a70, 0x8070ff70,
-	0x00000200, 0x7e000280,
-	0x7e020280, 0x7e040280,
-	0xbefd0080, 0xd7610002,
-	0x0000fa71, 0x807d817d,
-	0xb8faf802, 0xbf0c8b7a,
-	0xbfa20003, 0xbe804fc2,
-	0xbf94fffe, 0xbfa10001,
-	0xbe804ec4, 0xbf94fffc,
-	0xbefa4c88, 0xbfc70000,
-	0xbf0c807a, 0xbfa20006,
-	0x9371ff7a, 0x00070004,
-	0x937aff7a, 0x00070010,
-	0xbf06717a, 0xbfa2fff6,
-	0xb8faf804, 0x8b7aff7a,
-	0x0001000c, 0x9178ff78,
-	0x0001000c, 0x8c787a78,
-	0xd7610002, 0x0000fa6c,
-	0x807d817d, 0x917aff6d,
-	0x80000000, 0xd7610002,
+	0x010c0000, 0x00020000,
+	0xee0a4074, 0x018c0000,
+	0x00030000, 0xb8f03b05,
+	0x80708170, 0xbf0d9973,
+	0xbfa20002, 0x84708970,
+	0xbfa00001, 0x84708a70,
+	0x8070ff70, 0x00000200,
+	0x7e000280, 0x7e020280,
+	0x7e040280, 0xbefd0080,
+	0xd7610002, 0x0000fa71,
+	0x807d817d, 0xb8faf802,
+	0xbf0c8b7a, 0xbfa20003,
+	0xbe804fc2, 0xbf94fffe,
+	0xbfa10001, 0xbe804ec4,
+	0xbf94fffc, 0xbefa4c88,
+	0xbfc70000, 0xbf0c807a,
+	0xbfa20006, 0x9371ff7a,
+	0x00070004, 0x937aff7a,
+	0x00070010, 0xbf06717a,
+	0xbfa2fff6, 0xb8faf804,
+	0x8b7aff7a, 0x0001000c,
+	0x9178ff78, 0x0001000c,
+	0x8c787a78, 0xd7610002,
+	0x0000fa6c, 0x807d817d,
+	0x917aff6d, 0x80000000,
+	0xd7610002, 0x0000fa7a,
+	0x807d817d, 0xd7610002,
+	0x0000fa6e, 0x807d817d,
+	0xbefa0080, 0xd7610002,
 	0x0000fa7a, 0x807d817d,
-	0xd7610002, 0x0000fa6e,
-	0x807d817d, 0xbefa0080,
+	0xd7610002, 0x0000fa78,
+	0x807d817d, 0xb8faf811,
 	0xd7610002, 0x0000fa7a,
 	0x807d817d, 0xd7610002,
-	0x0000fa78, 0x807d817d,
-	0xb8faf811, 0xd7610002,
+	0x0000fa6f, 0x807d817d,
+	0xb8f1f801, 0x937aff6d,
+	0x00060019, 0x847a8c7a,
+	0x8c717a71, 0xd7610002,
+	0x0000fa71, 0x807d817d,
+	0xb8f1f814, 0xd7610002,
+	0x0000fa71, 0x807d817d,
+	0xb8f1f815, 0xd7610002,
+	0x0000fa71, 0x807d817d,
+	0xb8f1f812, 0xd7610002,
+	0x0000fa71, 0x807d817d,
+	0xb8f1f813, 0xd7610002,
+	0x0000fa71, 0x807d817d,
+	0xb8faf802, 0xd7610002,
 	0x0000fa7a, 0x807d817d,
-	0xd7610002, 0x0000fa6f,
-	0x807d817d, 0xb8f1f801,
-	0x937aff6d, 0x00060019,
-	0x847a8c7a, 0x8c717a71,
-	0xd7610002, 0x0000fa71,
-	0x807d817d, 0xb8f1f814,
-	0xd7610002, 0x0000fa71,
-	0x807d817d, 0xb8f1f815,
-	0xd7610002, 0x0000fa71,
-	0x807d817d, 0xb8f1f812,
-	0xd7610002, 0x0000fa71,
-	0x807d817d, 0xb8f1f813,
-	0xd7610002, 0x0000fa71,
-	0x807d817d, 0xb8faf802,
+	0xbefa50c1, 0xbfc70000,
 	0xd7610002, 0x0000fa7a,
-	0x807d817d, 0xbefa50c1,
+	0x807d817d, 0xbefa4c88,
 	0xbfc70000, 0xd7610002,
 	0x0000fa7a, 0x807d817d,
-	0xbefa4c88, 0xbfc70000,
-	0xd7610002, 0x0000fa7a,
-	0x807d817d, 0xbefe00ff,
-	0x0000ffff, 0xbeff0080,
+	0xb8faf81a, 0xd7610002,
+	0x0000fa7a, 0x807d817d,
+	0xbefe00c1, 0xbeff0080,
 	0x80767074, 0x82778075,
 	0xee0a4076, 0x010c0000,
 	0x00000000, 0xbefe00c1,
@@ -5061,7 +5057,7 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0x018c0000, 0x00030000,
 	0x807d847d, 0x8070ff70,
 	0x00000400, 0xbf0a7b7d,
-	0xbfa2ffe9, 0xbfa00183,
+	0xbfa2ffe9, 0xbfa00184,
 	0xbef4007e, 0x8b75ff7f,
 	0x01ffffff, 0xbef1007f,
 	0xb8f20742, 0x84729972,
@@ -5229,6 +5225,8 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0x856e906e, 0x8b6e6e6e,
 	0xbfa10003, 0xbe804ec3,
 	0x816ec16e, 0xbfa0fffb,
+	0xf4601bbb, 0xf8000040,
+	0xbfc70000, 0xb96ef81a,
 	0xbefd006f, 0xbefe0070,
 	0xbeff0071, 0xb979f822,
 	0xb97b2011, 0x857b867b,
@@ -5248,19 +5246,17 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0x856e8e77, 0xb96e3021,
 	0x8b6dff6d, 0x01ffffff,
 	0x8bfe7e7e, 0x8bea6a6a,
-	0x936eff77, 0x0002001a,
-	0xb96ef81a, 0xb97af804,
+	0xb97af804, 0xb8eef802,
+	0xbf0c8b6e, 0xbfa20003,
+	0xbe804fc2, 0xbf94fffe,
+	0xbfa10001, 0xbe804ec4,
+	0xbf94fffc, 0x857a897a,
+	0xb97a0244, 0xbe804a6c,
 	0xb8eef802, 0xbf0c8b6e,
 	0xbfa20003, 0xbe804fc2,
 	0xbf94fffe, 0xbfa10001,
 	0xbe804ec4, 0xbf94fffc,
-	0x857a897a, 0xb97a0244,
-	0xbe804a6c, 0xb8eef802,
-	0xbf0c8b6e, 0xbfa20003,
-	0xbe804fc2, 0xbf94fffe,
-	0xbfa10001, 0xbe804ec4,
-	0xbf94fffc, 0xbfb10000,
+	0xbfb10000, 0xbf9f0000,
 	0xbf9f0000, 0xbf9f0000,
 	0xbf9f0000, 0xbf9f0000,
-	0xbf9f0000, 0x00000000,
 };
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index ace2a9f2ac73..ccc61f60ceb3 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -36,6 +36,7 @@
 #define NUM_NAMED_BARRIERS (ASIC_FAMILY == CHIP_GC_12_0_3 ? 0x10 : 0)
 #define HAVE_CLUSTER_BARRIER (ASIC_FAMILY == CHIP_GC_12_0_3)
 #define CLUSTER_BARRIER_SERIALIZE_WORKAROUND (ASIC_FAMILY == CHIP_GC_12_0_3)
+#define RELAXED_SCHEDULING_IN_TRAP (ASIC_FAMILY == CHIP_GFX12)
 
 #define SINGLE_STEP_MISSED_WORKAROUND 1	//workaround for lost TRAP_AFTER_INST exception when SAVECTX raised
 #define HAVE_VALU_SGPR_HAZARD (ASIC_FAMILY == CHIP_GFX12)
@@ -110,9 +111,11 @@ var BARRIER_STATE_MEMBER_OFFSET			= 4
 var BARRIER_STATE_MEMBER_SIZE			= 7
 var BARRIER_STATE_VALID_OFFSET			= 0
 
+#if RELAXED_SCHEDULING_IN_TRAP
 var TTMP11_SCHED_MODE_SHIFT			= 26
 var TTMP11_SCHED_MODE_SIZE			= 2
 var TTMP11_SCHED_MODE_MASK			= 0xC000000
+#endif
 
 var NAMED_BARRIERS_SR_OFFSET_FROM_HWREG		= 0x80
 var S_BARRIER_INIT_MEMBERCNT_MASK		= 0x7F0000
@@ -223,18 +226,22 @@ L_JUMP_TO_RESTORE:
 	s_branch	L_RESTORE
 
 L_SKIP_RESTORE:
+#if RELAXED_SCHEDULING_IN_TRAP
 	// Assume most relaxed scheduling mode is set. Save and revert to normal mode.
 	s_getreg_b32	ttmp2, hwreg(HW_REG_WAVE_SCHED_MODE)
 	s_wait_alu	0
 	s_setreg_imm32_b32	hwreg(HW_REG_WAVE_SCHED_MODE, \
 		SQ_WAVE_SCHED_MODE_DEP_MODE_SHIFT, SQ_WAVE_SCHED_MODE_DEP_MODE_SIZE), 0
+#endif
 
 	s_getreg_b32	s_save_state_priv, hwreg(HW_REG_WAVE_STATE_PRIV)	//save STATUS since we will change SCC
 
+#if RELAXED_SCHEDULING_IN_TRAP
 	// Save SCHED_MODE[1:0] into ttmp11[27:26].
 	s_andn2_b32	ttmp11, ttmp11, TTMP11_SCHED_MODE_MASK
 	s_lshl_b32	ttmp2, ttmp2, TTMP11_SCHED_MODE_SHIFT
 	s_or_b32	ttmp11, ttmp11, ttmp2
+#endif
 
 	// Clear SPI_PRIO: do not save with elevated priority.
 	// Clear ECC_ERR: prevents SQC store and triggers FATAL_HALT if setreg'd.
@@ -316,7 +323,7 @@ L_FETCH_2ND_TRAP:
 	s_cbranch_scc0	L_NO_SIGN_EXTEND_TMA
 	s_or_b32	ttmp15, ttmp15, ~ADDRESS_HI32_MASK
 L_NO_SIGN_EXTEND_TMA:
-#if ASIC_FAMILY == CHIP_GFX12
+#if RELAXED_SCHEDULING_IN_TRAP
 	// Move SCHED_MODE[1:0] from ttmp11 to unused bits in ttmp1[27:26] (return PC_HI).
 	// The second-level trap will restore from ttmp1 for backwards compatibility.
 	s_and_b32	ttmp2, ttmp11, TTMP11_SCHED_MODE_MASK
@@ -382,8 +389,10 @@ L_EXIT_TRAP:
 	// Only restore fields which the trap handler changes.
 	s_lshr_b32	s_save_state_priv, s_save_state_priv, SQ_WAVE_STATE_PRIV_SCC_SHIFT
 
+#if RELAXED_SCHEDULING_IN_TRAP
 	// Assume relaxed scheduling mode after this point.
 	restore_sched_mode(ttmp2)
+#endif
 
 	s_setreg_b32	hwreg(HW_REG_WAVE_STATE_PRIV, SQ_WAVE_STATE_PRIV_SCC_SHIFT, \
 		SQ_WAVE_STATE_PRIV_POISON_ERR_SHIFT - SQ_WAVE_STATE_PRIV_SCC_SHIFT + 1), s_save_state_priv
@@ -591,8 +600,18 @@ L_SAVE_HWREG:
 	write_hwreg_to_v2(s_save_tmp)
 #endif
 
+#if ASIC_FAMILY >= CHIP_GC_12_0_3
+	s_getreg_b32	s_save_tmp, hwreg(HW_REG_WAVE_SCHED_MODE)
+	write_hwreg_to_v2(s_save_tmp)
+#endif
+
+#if ! SAVE_TTMPS_IN_SGPR_BLOCK
 	// Write HWREGs with 16 VGPR lanes. TTMPs occupy space after this.
 	s_mov_b32       exec_lo, 0xFFFF
+#else
+	// All 128 bytes are available for HWREGs.
+	s_mov_b32       exec_lo, 0xFFFFFFFF
+#endif
 	s_mov_b32	exec_hi, 0x0
 	s_add_u32	s_save_addr_lo, s_save_base_addr_lo, s_save_mem_offset
 	s_addc_u32	s_save_addr_hi, s_save_base_addr_hi, 0x0
@@ -1155,6 +1174,12 @@ L_SKIP_TRAP_CLUSTER_BARRIER_SIGNAL:
 L_SKIP_CLUSTER_BARRIER_RESTORE:
 #endif
 
+#if ASIC_FAMILY >= CHIP_GC_12_0_3
+	s_load_b32	s_restore_tmp, [s_restore_addr_lo, s_restore_addr_hi], null scope:SCOPE_SYS offset:0x40
+	s_wait_kmcnt	0
+	s_setreg_b32	hwreg(HW_REG_WAVE_SCHED_MODE), s_restore_tmp
+#endif
+
 	s_mov_b32	m0, s_restore_m0
 	s_mov_b32	exec_lo, s_restore_exec_lo
 	s_mov_b32	exec_hi, s_restore_exec_hi
@@ -1194,8 +1219,10 @@ L_SKIP_CLUSTER_BARRIER_RESTORE:
 	s_and_b64	exec, exec, exec					// Restore STATUS.EXECZ, not writable by s_setreg_b32
 	s_and_b64	vcc, vcc, vcc						// Restore STATUS.VCCZ, not writable by s_setreg_b32
 
+#if RELAXED_SCHEDULING_IN_TRAP
 	// Assume relaxed scheduling mode after this point.
 	restore_sched_mode(s_restore_tmp)
+#endif
 
 	s_setreg_b32	hwreg(HW_REG_WAVE_STATE_PRIV), s_restore_state_priv	// SCC is included, which is changed by previous salu
 
@@ -1347,11 +1374,12 @@ L_NOT_IN_CLUSTER:
 #endif
 end
 
-
+#if RELAXED_SCHEDULING_IN_TRAP
 function restore_sched_mode(s_tmp)
 	s_bfe_u32	s_tmp, ttmp11, (TTMP11_SCHED_MODE_SHIFT | (TTMP11_SCHED_MODE_SIZE << 0x10))
 	s_setreg_b32	hwreg(HW_REG_WAVE_SCHED_MODE), s_tmp
 end
+#endif
 
 function restore_barrier_signal_count(barrier_id)
 	// extract the saved signal count from s_restore_tmp
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/5] drm/amdkfd: Do not include VGPR MSBs in saved PC during save
  2026-01-16 20:39 [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support Jay Cornwall
                   ` (3 preceding siblings ...)
  2026-01-16 20:39 ` [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode Jay Cornwall
@ 2026-01-16 20:39 ` Jay Cornwall
  4 siblings, 0 replies; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC (permalink / raw)
  To: amd-gfx; +Cc: Lancelot Six, Alexey Kondratiev, Jay Cornwall, Vladimir Indic

From: Lancelot Six <lancelot.six@amd.com>

The current trap handler uses the top bits of ttmp1 to store a copy of
sq_wave_mode.*vgpr_msb (except for src2_vgpr_msb).  This is so the
effective values in sq_wave_mode can be cleared to ensure correct
behavior of the trap handler.

When saving sq_wave_mode, the trap handler correctly rebuilds the
expected value (with *vgpr_msb restored), so the save area is correct.
However, the PC itself is copied from ttmp[0:1], which contains the
wave's PC as well as the saved MSBs.

The debugger reads the PC from the save area and is confused when non-0
values from VGPR_MSBs are present.

This patch fixes this by saving the PC in the save area's PC slot, not
the composite of the PC and VGPR_MSBs.  On restore, the VGPR_MSBs are
restored from sq_wave_mode.
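
For illustration only (not part of the patch), the change to the saved PC_HI can be modelled in C. The mask values are read off the literal operands in the cwsr_trap_handler.h hunks of this patch (0x80000000 before; 0x0000ffff for gfx12 and 0x01ffffff for gfx12.1 after), on the assumption that the dword following each changed instruction word is its literal. The per-ASIC macro suffixes and the function names are hypothetical, and the input value in the usage note is made up.

```c
#include <stdint.h>

/* Literal used by the old s_andn2_b32 in both binaries. */
#define S_SAVE_PC_HI_FIRST_WAVE_MASK 0x80000000u

/*
 * Literals used by the new s_and_b32. The source has a single
 * ADDRESS_HI32_MASK; the suffixes here are only for this sketch.
 */
#define ADDRESS_HI32_MASK_GFX12      0x0000FFFFu
#define ADDRESS_HI32_MASK_GFX12_1    0x01FFFFFFu

/* Before: only the FIRST_WAVE bit is cleared, other high bits survive. */
uint32_t saved_pc_hi_old(uint32_t s_save_pc_hi)
{
	return s_save_pc_hi & ~S_SAVE_PC_HI_FIRST_WAVE_MASK;
}

/* After: only the address bits of PC_HI are kept. */
uint32_t saved_pc_hi_new(uint32_t s_save_pc_hi, uint32_t address_hi32_mask)
{
	return s_save_pc_hi & address_hi32_mask;
}
```

With a made-up s_save_pc_hi of 0xFE001234 on gfx12.1, the old path stores 0x7E001234 in the PC slot, while the new path stores 0x00001234, which is what the debugger expects to read.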

Signed-off-by: Lancelot Six <lancelot.six@amd.com>
Tested-by: Alexey Kondratiev <Alexey.Kondratiev@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h         | 6 +++---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 9bb7fb6a83ed..39bdc98b8b6d 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -3760,8 +3760,8 @@ static const uint32_t cwsr_trap_gfx12_hex[] = {
 	0xb8faf804, 0x8b7a847a,
 	0x91788478, 0x8c787a78,
 	0xd7610002, 0x0000fa6c,
-	0x807d817d, 0x917aff6d,
-	0x80000000, 0xd7610002,
+	0x807d817d, 0x8b7aff6d,
+	0x0000ffff, 0xd7610002,
 	0x0000fa7a, 0x807d817d,
 	0xd7610002, 0x0000fa6e,
 	0x807d817d, 0xd7610002,
@@ -4848,7 +4848,7 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
 	0x9178ff78, 0x0001000c,
 	0x8c787a78, 0xd7610002,
 	0x0000fa6c, 0x807d817d,
-	0x917aff6d, 0x80000000,
+	0x8b7aff6d, 0x01ffffff,
 	0xd7610002, 0x0000fa7a,
 	0x807d817d, 0xd7610002,
 	0x0000fa6e, 0x807d817d,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index ccc61f60ceb3..c33e7660d8f4 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -544,7 +544,7 @@ L_SAVE_HWREG:
 	s_or_b32	s_save_state_priv, s_save_state_priv, s_save_tmp
 
 	write_hwreg_to_v2(s_save_pc_lo)
-	s_andn2_b32	s_save_tmp, s_save_pc_hi, S_SAVE_PC_HI_FIRST_WAVE_MASK
+	s_and_b32       s_save_tmp, s_save_pc_hi, ADDRESS_HI32_MASK
 	write_hwreg_to_v2(s_save_tmp)
 	write_hwreg_to_v2(s_save_exec_lo)
 #if WAVE32_ONLY
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source
  2026-01-16 20:39 ` [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source Jay Cornwall
@ 2026-01-20 22:34   ` Lancelot SIX
  2026-01-21 10:27     ` Indic, Vladimir
  0 siblings, 1 reply; 14+ messages in thread
From: Lancelot SIX @ 2026-01-20 22:34 UTC (permalink / raw)
  To: Jay Cornwall, amd-gfx; +Cc: Vladimir Indic

Hi,

This looks good to me, thanks.

On 16/01/2026 20:39, Jay Cornwall wrote:
> Binary and source desynced during branch activity. Source merge
> also introduced compile error.
> 
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>

Reviewed-by: Lancelot Six <lancelot.six@amd.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
  2026-01-16 20:39 ` [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler Jay Cornwall
@ 2026-01-20 22:38   ` Lancelot SIX
  2026-01-21 10:32     ` Indic, Vladimir
  0 siblings, 1 reply; 14+ messages in thread
From: Lancelot SIX @ 2026-01-20 22:38 UTC (permalink / raw)
  To: Jay Cornwall, amd-gfx; +Cc: Joseph Greathouse, Vladimir Indic

Hi,

This looks good, thanks for fixing this.

Thanks,
Lancelot.

On 16/01/2026 20:39, Jay Cornwall wrote:
> Scalar loads may arrive out-of-order with respect to KMCNT.
> The affected code expects the two loads to arrive in-order.
> 
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Joseph Greathouse <joseph.greathouse@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround
  2026-01-16 20:39 ` [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround Jay Cornwall
@ 2026-01-20 23:27   ` Lancelot SIX
  2026-01-21 10:37     ` Indic, Vladimir
  0 siblings, 1 reply; 14+ messages in thread
From: Lancelot SIX @ 2026-01-20 23:27 UTC (permalink / raw)
  To: Jay Cornwall, amd-gfx; +Cc: Gang Ba, Harish Kasiviswanathan, Vladimir Indic

Hi,

On 16/01/2026 20:39, Jay Cornwall wrote:
> Trap cluster barrier may not serialize with user cluster barrier
> under some circumstances. Add a check for pending user cluster
> barrier complete.
> 
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Tested-by: Gang Ba <Gang.Ba@amd.com>
> Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
To the best of my understanding, this looks OK. Thanks.

Best,
Lancelot.

Reviewed-by: Lancelot Six <lancelot.six@amd.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
  2026-01-16 20:39 ` [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode Jay Cornwall
@ 2026-01-20 23:30   ` Lancelot SIX
  2026-01-21 10:46     ` Indic, Vladimir
  0 siblings, 1 reply; 14+ messages in thread
From: Lancelot SIX @ 2026-01-20 23:30 UTC (permalink / raw)
  To: Jay Cornwall, amd-gfx; +Cc: Vladimir Indic

Hi,

That looks good to me, thanks.

Best,
Lancelot.

On 16/01/2026 20:39, Jay Cornwall wrote:
> - Leave DEP_MODE unchanged as it is ignored in the trap handler
> - Save/restore SCHED_MODE (gfx12.0 saves in ttmp11)
> 
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>

Reviewed-by: Lancelot Six <lancelot.six@amd.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source
  2026-01-20 22:34   ` Lancelot SIX
@ 2026-01-21 10:27     ` Indic, Vladimir
  0 siblings, 0 replies; 14+ messages in thread
From: Indic, Vladimir @ 2026-01-21 10:27 UTC (permalink / raw)
  To: Six, Lancelot, Cornwall, Jay, amd-gfx@lists.freedesktop.org

[AMD Official Use Only - AMD Internal Distribution Only]

Adding one more review, the patch LGTM, thanks!

Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
  2026-01-20 22:38   ` Lancelot SIX
@ 2026-01-21 10:32     ` Indic, Vladimir
  0 siblings, 0 replies; 14+ messages in thread
From: Indic, Vladimir @ 2026-01-21 10:32 UTC (permalink / raw)
  To: Six, Lancelot, Cornwall, Jay, amd-gfx@lists.freedesktop.org
  Cc: Greathouse, Joseph

Adding one more review. LGTM!

Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround
  2026-01-20 23:27   ` Lancelot SIX
@ 2026-01-21 10:37     ` Indic, Vladimir
  0 siblings, 0 replies; 14+ messages in thread
From: Indic, Vladimir @ 2026-01-21 10:37 UTC (permalink / raw)
  To: Six, Lancelot, Cornwall, Jay, amd-gfx@lists.freedesktop.org
  Cc: Ba, Gang, Kasiviswanathan, Harish

One more
Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
  2026-01-20 23:30   ` Lancelot SIX
@ 2026-01-21 10:46     ` Indic, Vladimir
  0 siblings, 0 replies; 14+ messages in thread
From: Indic, Vladimir @ 2026-01-21 10:46 UTC (permalink / raw)
  To: Six, Lancelot, Cornwall, Jay, amd-gfx@lists.freedesktop.org

One more review, LGTM! Thanks!

Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread
