* [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support
From: Jay Cornwall @ 2026-01-16 20:39 UTC
To: amd-gfx; +Cc: Jay Cornwall
Fix a broken merge and upstream the missing gfx12.1 changes.
Jay Cornwall (4):
drm/amdkfd: Sync trap handler binary with source
drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
drm/amdkfd: gfx12.1 cluster barrier context save workaround
drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
Lancelot Six (1):
drm/amdkfd: Do not include VGPR MSBs in saved PC during save
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 1435 ++++++++---------
.../amd/amdkfd/cwsr_trap_handler_gfx12.asm | 73 +-
2 files changed, 744 insertions(+), 764 deletions(-)
--
2.34.1
* [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source
From: Jay Cornwall @ 2026-01-16 20:39 UTC
To: amd-gfx; +Cc: Jay Cornwall, Lancelot Six, Vladimir Indic
The trap handler binary and its assembly source went out of sync
during branch activity. The source merge also introduced a compile
error.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 1742 ++++++++---------
.../amd/amdkfd/cwsr_trap_handler_gfx12.asm | 1 +
2 files changed, 836 insertions(+), 907 deletions(-)
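Reviewer aid, not part of the patch: the first hunk of each regenerated array changes only the second entry word (0xbfa002b2 to 0xbfa00239 for gfx12, 0xbfa003b7 to 0xbfa003aa for gfx12.1). The sketch below reads those words under the assumption that 0xbfa0xxxx is an SOPP-encoded s_branch whose low 16 bits are a signed offset in dwords. The helper names are hypothetical and do not exist in the kernel tree.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: SOPP layout, signed 16-bit immediate in bits [15:0],
 * interpreted as a branch offset counted in dwords. */
static inline int16_t sopp_simm16(uint32_t insn)
{
	return (int16_t)(insn & 0xffffu);
}

/* Hypothetical helper: number of dwords by which a branch target
 * moved closer between the old and the regenerated binary. */
static inline int branch_shift_dwords(uint32_t old_insn, uint32_t new_insn)
{
	return (int)sopp_simm16(old_insn) - (int)sopp_simm16(new_insn);
}

/* Self-check using the entry words quoted from the hunks in this patch. */
static inline void check_entry_branches(void)
{
	/* cwsr_trap_gfx12_hex: second word before and after */
	assert(branch_shift_dwords(0xbfa002b2u, 0xbfa00239u) == 121);
	/* cwsr_trap_gfx12_1_0_hex: second word before and after */
	assert(branch_shift_dwords(0xbfa003b7u, 0xbfa003aau) == 13);
}
```

Under that assumption the second entry branch lands 121 dwords earlier for gfx12 and 13 dwords earlier for gfx12.1. This only shows that the length of the code before each target changed in the regenerated binary. It does not verify that the array matches the assembly source.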
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index dfffda4aa8e2..6281b2f9faee 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -3644,7 +3644,7 @@ static const uint32_t cwsr_trap_gfx9_4_3_hex[] = {
};
static const uint32_t cwsr_trap_gfx12_hex[] = {
- 0xbfa00001, 0xbfa002b2,
+ 0xbfa00001, 0xbfa00239,
0xb0804009, 0xb8eef81a,
0xbf880000, 0xb980081a,
0x00000000, 0xb8f8f804,
@@ -3711,464 +3711,385 @@ static const uint32_t cwsr_trap_gfx12_hex[] = {
0x807a817a, 0xbf0d997b,
0xbfa20002, 0x847a897a,
0xbfa00001, 0x847a8a7a,
- 0xb8fb1e06, 0x847b8a7b,
- 0x807a7b7a, 0x8b7bff7f,
- 0x0000ffff, 0x807aff7a,
- 0x00000200, 0x807a7e7a,
- 0x827b807b, 0xd7610000,
- 0x00010870, 0xd7610000,
- 0x00010a71, 0xd7610000,
- 0x00010c72, 0xd7610000,
- 0x00010e73, 0xd7610000,
- 0x00011074, 0xd7610000,
- 0x00011275, 0xd7610000,
- 0x00011476, 0xd7610000,
- 0x00011677, 0xd7610000,
- 0x00011a79, 0xd7610000,
- 0x00011c7e, 0xd7610000,
- 0x00011e7f, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xbefe00ff,
- 0x00003fff, 0xbeff0080,
- 0xee0a407a, 0x000c0000,
- 0x00004000, 0xd760007a,
- 0x00011d00, 0xd760007b,
- 0x00011f00, 0xbefe007a,
- 0xbeff007b, 0xbef4007e,
- 0x8b75ff7f, 0x0000ffff,
- 0x8c75ff75, 0x00040000,
- 0xbef60080, 0xbef700ff,
- 0x10807fac, 0xbef1007d,
- 0xbef00080, 0xb8f30742,
- 0x84739973, 0xbefe00c1,
- 0x857d9973, 0x8b7d817d,
- 0xbf06817d, 0xbfa20002,
- 0xbeff0080, 0xbfa00002,
- 0xbeff00c1, 0xbfa0000c,
- 0xbef600ff, 0x01000000,
- 0xc4068070, 0x008ce801,
- 0x00008000, 0xc4068070,
- 0x008ce802, 0x00010000,
- 0xc4068070, 0x008ce803,
- 0x00018000, 0xbfa0000b,
- 0xbef600ff, 0x01000000,
- 0xc4068070, 0x008ce801,
- 0x00010000, 0xc4068070,
- 0x008ce802, 0x00020000,
- 0xc4068070, 0x008ce803,
- 0x00030000, 0xb8f03b05,
- 0x80708170, 0xbf0d9973,
- 0xbfa20002, 0x84708970,
- 0xbfa00001, 0x84708a70,
- 0xb8fa1e06, 0x847a8a7a,
- 0x80707a70, 0x8070ff70,
- 0x00000200, 0xbef600ff,
- 0x01000000, 0x7e000280,
- 0x7e020280, 0x7e040280,
- 0xbe804ec2, 0xbf94fffe,
- 0xb8faf804, 0x8b7a847a,
- 0x91788478, 0x8c787a78,
- 0x917aff6d, 0x80000000,
- 0xd7610002, 0x00010071,
- 0xd7610002, 0x0001026c,
- 0xd7610002, 0x0001047a,
- 0xd7610002, 0x0001066e,
- 0xd7610002, 0x0001086f,
- 0xd7610002, 0x00010a78,
- 0xd7610002, 0x00010e7b,
- 0xd8500000, 0x00000000,
- 0xd8500000, 0x00000000,
- 0xd8500000, 0x00000000,
- 0xd8500000, 0x00000000,
- 0xd8500000, 0x00000000,
- 0xd8500000, 0x00000000,
- 0xd8500000, 0x00000000,
- 0xd8500000, 0x00000000,
- 0xb8faf811, 0xd7610002,
- 0x00010c7a, 0xb8faf801,
- 0xd7610002, 0x0001107a,
- 0xb8faf814, 0xd7610002,
- 0x0001127a, 0xb8faf815,
- 0xd7610002, 0x0001147a,
- 0xb8faf812, 0xd7610002,
- 0x0001167a, 0xb8faf813,
- 0xd7610002, 0x0001187a,
- 0xb8faf802, 0xd7610002,
- 0x00011a7a, 0xbefa50c1,
- 0xbfc70000, 0xd7610002,
- 0x00011c7a, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xbefe00ff,
- 0x0000ffff, 0xbeff0080,
- 0xc4068070, 0x008ce802,
- 0x00000000, 0xbefe00c1,
+ 0x8b7bff7f, 0x0000ffff,
+ 0x807aff7a, 0x00000240,
+ 0x807a7e7a, 0x827b807b,
+ 0xd7610000, 0x00010870,
+ 0xd7610000, 0x00010a71,
+ 0xd7610000, 0x00010c72,
+ 0xd7610000, 0x00010e73,
+ 0xd7610000, 0x00011074,
+ 0xd7610000, 0x00011275,
+ 0xd7610000, 0x00011476,
+ 0xd7610000, 0x00011677,
+ 0xd7610000, 0x00011a79,
+ 0xd7610000, 0x00011c7e,
+ 0xd7610000, 0x00011e7f,
+ 0xbefe00ff, 0x00003fff,
+ 0xbeff0080, 0xee0a407a,
+ 0x000c0000, 0x00000000,
+ 0xd760007a, 0x00011d00,
+ 0xd760007b, 0x00011f00,
+ 0xbefe007a, 0xbeff007b,
+ 0xbef4007e, 0x8b75ff7f,
+ 0x0000ffff, 0xbef1007d,
+ 0xb8f30742, 0x84739973,
+ 0xbefe00c1, 0x857d9973,
+ 0x8b7d817d, 0xbf06817d,
+ 0xbfa20002, 0xbeff0080,
+ 0xbfa00002, 0xbeff00c1,
+ 0xbfa0000a, 0xee0a4074,
+ 0x008c0000, 0x00008000,
+ 0xee0a4074, 0x010c0000,
+ 0x00010000, 0xee0a4074,
+ 0x018c0000, 0x00018000,
+ 0xbfa00009, 0xee0a4074,
+ 0x008c0000, 0x00010000,
+ 0xee0a4074, 0x010c0000,
+ 0x00020000, 0xee0a4074,
+ 0x018c0000, 0x00030000,
0xb8f03b05, 0x80708170,
0xbf0d9973, 0xbfa20002,
0x84708970, 0xbfa00001,
- 0x84708a70, 0xb8fa1e06,
- 0x847a8a7a, 0x80707a70,
- 0xbef600ff, 0x01000000,
+ 0x84708a70, 0x8070ff70,
+ 0x00000200, 0x7e000280,
+ 0x7e020280, 0x7e040280,
+ 0xbefd0080, 0xbe804ec2,
+ 0xbf94fffe, 0xb8faf804,
+ 0x8b7a847a, 0x91788478,
+ 0x8c787a78, 0xd7610002,
+ 0x0000fa71, 0x807d817d,
+ 0xd7610002, 0x0000fa6c,
+ 0x807d817d, 0x917aff6d,
+ 0x80000000, 0xd7610002,
+ 0x0000fa7a, 0x807d817d,
+ 0xd7610002, 0x0000fa6e,
+ 0x807d817d, 0xd7610002,
+ 0x0000fa6f, 0x807d817d,
+ 0xd7610002, 0x0000fa78,
+ 0x807d817d, 0xb8faf811,
+ 0xd7610002, 0x0000fa7a,
+ 0x807d817d, 0xbefa0080,
+ 0xd7610002, 0x0000fa7a,
+ 0x807d817d, 0xb8f1f801,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8f1f814,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8f1f815,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8f1f812,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8f1f813,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8faf802,
+ 0xd7610002, 0x0000fa7a,
+ 0x807d817d, 0xbefa50c1,
+ 0xbfc70000, 0xd7610002,
+ 0x0000fa7a, 0x807d817d,
+ 0xbefe00ff, 0x0000ffff,
+ 0xbeff0080, 0x80767074,
+ 0x82778075, 0xee0a4076,
+ 0x010c0000, 0x00000000,
+ 0xbefe00c1, 0xb8f03b05,
+ 0x80708170, 0xbf0d9973,
+ 0xbfa20002, 0x84708970,
+ 0xbfa00001, 0x84708a70,
0xbef90080, 0xbefd0080,
0xbf800000, 0xbe804100,
0xbe824102, 0xbe844104,
0xbe864106, 0xbe884108,
0xbe8a410a, 0xbe8c410c,
- 0xbe8e410e, 0xbf068079,
- 0xbfa10032, 0xd7610002,
- 0x00010000, 0xd7610002,
- 0x00010201, 0xd7610002,
- 0x00010402, 0xd7610002,
- 0x00010603, 0xd7610002,
- 0x00010804, 0xd7610002,
- 0x00010a05, 0xd7610002,
- 0x00010c06, 0xd7610002,
- 0x00010e07, 0xd7610002,
- 0x00011008, 0xd7610002,
- 0x00011209, 0xd7610002,
- 0x0001140a, 0xd7610002,
- 0x0001160b, 0xd7610002,
- 0x0001180c, 0xd7610002,
- 0x00011a0d, 0xd7610002,
- 0x00011c0e, 0xd7610002,
- 0x00011e0f, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0x80799079,
- 0xbfa00038, 0xd7610002,
- 0x00012000, 0xd7610002,
- 0x00012201, 0xd7610002,
- 0x00012402, 0xd7610002,
- 0x00012603, 0xd7610002,
- 0x00012804, 0xd7610002,
- 0x00012a05, 0xd7610002,
- 0x00012c06, 0xd7610002,
- 0x00012e07, 0xd7610002,
- 0x00013008, 0xd7610002,
- 0x00013209, 0xd7610002,
- 0x0001340a, 0xd7610002,
- 0x0001360b, 0xd7610002,
- 0x0001380c, 0xd7610002,
- 0x00013a0d, 0xd7610002,
- 0x00013c0e, 0xd7610002,
- 0x00013e0f, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0x80799079,
- 0xc4068070, 0x008ce802,
- 0x00000000, 0x8070ff70,
- 0x00000080, 0xbef90080,
- 0x7e040280, 0x807d907d,
- 0xbf0aff7d, 0x00000060,
- 0xbfa2ff88, 0xbe804100,
- 0xbe824102, 0xbe844104,
- 0xbe864106, 0xbe884108,
- 0xbe8a410a, 0xd7610002,
- 0x00010000, 0xd7610002,
- 0x00010201, 0xd7610002,
- 0x00010402, 0xd7610002,
- 0x00010603, 0xd7610002,
- 0x00010804, 0xd7610002,
- 0x00010a05, 0xd7610002,
- 0x00010c06, 0xd7610002,
- 0x00010e07, 0xd7610002,
- 0x00011008, 0xd7610002,
- 0x00011209, 0xd7610002,
- 0x0001140a, 0xd7610002,
- 0x0001160b, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xd8500000,
- 0x00000000, 0xc4068070,
- 0x008ce802, 0x00000000,
- 0xbefe00c1, 0x857d9973,
- 0x8b7d817d, 0xbf06817d,
- 0xbfa20002, 0xbeff0080,
- 0xbfa00001, 0xbeff00c1,
- 0xb8fb4306, 0x8b7bc17b,
- 0xbfa10044, 0x8b7aff6d,
- 0x80000000, 0xbfa10041,
- 0x847b897b, 0xbef6007b,
+ 0xbe8e410e, 0xd7610002,
+ 0x0000f200, 0x80798179,
+ 0xd7610002, 0x0000f201,
+ 0x80798179, 0xd7610002,
+ 0x0000f202, 0x80798179,
+ 0xd7610002, 0x0000f203,
+ 0x80798179, 0xd7610002,
+ 0x0000f204, 0x80798179,
+ 0xd7610002, 0x0000f205,
+ 0x80798179, 0xd7610002,
+ 0x0000f206, 0x80798179,
+ 0xd7610002, 0x0000f207,
+ 0x80798179, 0xd7610002,
+ 0x0000f208, 0x80798179,
+ 0xd7610002, 0x0000f209,
+ 0x80798179, 0xd7610002,
+ 0x0000f20a, 0x80798179,
+ 0xd7610002, 0x0000f20b,
+ 0x80798179, 0xd7610002,
+ 0x0000f20c, 0x80798179,
+ 0xd7610002, 0x0000f20d,
+ 0x80798179, 0xd7610002,
+ 0x0000f20e, 0x80798179,
+ 0xd7610002, 0x0000f20f,
+ 0x80798179, 0xbf06a079,
+ 0xbfa10009, 0x80767074,
+ 0x82778075, 0xee0a4076,
+ 0x010c0000, 0x00000000,
+ 0x8070ff70, 0x00000080,
+ 0xbef90080, 0x7e040280,
+ 0x807d907d, 0xbf0aff7d,
+ 0x00000060, 0xbfa2ffb9,
+ 0xbe804100, 0xbe824102,
+ 0xbe844104, 0xbe864106,
+ 0xbe884108, 0xbe8a410a,
+ 0xd7610002, 0x0000f200,
+ 0x80798179, 0xd7610002,
+ 0x0000f201, 0x80798179,
+ 0xd7610002, 0x0000f202,
+ 0x80798179, 0xd7610002,
+ 0x0000f203, 0x80798179,
+ 0xd7610002, 0x0000f204,
+ 0x80798179, 0xd7610002,
+ 0x0000f205, 0x80798179,
+ 0xd7610002, 0x0000f206,
+ 0x80798179, 0xd7610002,
+ 0x0000f207, 0x80798179,
+ 0xd7610002, 0x0000f208,
+ 0x80798179, 0xd7610002,
+ 0x0000f209, 0x80798179,
+ 0xd7610002, 0x0000f20a,
+ 0x80798179, 0xd7610002,
+ 0x0000f20b, 0x80798179,
+ 0x80767074, 0x82778075,
+ 0xee0a4076, 0x010c0000,
+ 0x00000000, 0xbefe00c1,
+ 0x857d9973, 0x8b7d817d,
+ 0xbf06817d, 0xbfa20002,
+ 0xbeff0080, 0xbfa00001,
+ 0xbeff00c1, 0xb8fb4306,
+ 0x8b7bc17b, 0xbfa10042,
+ 0x8b7aff6d, 0x80000000,
+ 0xbfa1003f, 0x847b897b,
0xb8f03b05, 0x80708170,
0xbf0d9973, 0xbfa20002,
0x84708970, 0xbfa00001,
- 0x84708a70, 0xb8fa1e06,
- 0x847a8a7a, 0x80707a70,
- 0x8070ff70, 0x00000200,
- 0x8070ff70, 0x00000080,
- 0xbef600ff, 0x01000000,
- 0xd71f0000, 0x000100c1,
- 0xd7200000, 0x000200c1,
- 0x16000084, 0x857d9973,
- 0x8b7d817d, 0xbf06817d,
- 0xbefd0080, 0xbfa20013,
- 0xbe8300ff, 0x00000080,
+ 0x84708a70, 0x8070ff70,
+ 0x00000200, 0x8070ff70,
+ 0x00000080, 0xd71f0000,
+ 0x000100c1, 0xd7200000,
+ 0x000200c1, 0x16000084,
+ 0x857d9973, 0x8b7d817d,
+ 0xbf06817d, 0xbefd0080,
+ 0xbfa20015, 0xbe8300ff,
+ 0x00000080, 0xbf800000,
+ 0xbf800000, 0xbf800000,
+ 0xd8d80000, 0x01000000,
+ 0xbf8a0000, 0x80767074,
+ 0x82778075, 0xee0a4076,
+ 0x008c0000, 0x00000000,
+ 0x807d037d, 0x80700370,
+ 0xd5250000, 0x0001ff00,
+ 0x00000080, 0xbf0a7b7d,
+ 0xbfa2fff1, 0xbfa00014,
+ 0xbe8300ff, 0x00000100,
0xbf800000, 0xbf800000,
0xbf800000, 0xd8d80000,
0x01000000, 0xbf8a0000,
- 0xc4068070, 0x008ce801,
+ 0x80767074, 0x82778075,
+ 0xee0a4076, 0x008c0000,
0x00000000, 0x807d037d,
0x80700370, 0xd5250000,
- 0x0001ff00, 0x00000080,
- 0xbf0a7b7d, 0xbfa2fff3,
- 0xbfa00012, 0xbe8300ff,
- 0x00000100, 0xbf800000,
- 0xbf800000, 0xbf800000,
- 0xd8d80000, 0x01000000,
- 0xbf8a0000, 0xc4068070,
- 0x008ce801, 0x00000000,
- 0x807d037d, 0x80700370,
- 0xd5250000, 0x0001ff00,
- 0x00000100, 0xbf0a7b7d,
- 0xbfa2fff3, 0xbefe00c1,
- 0x857d9973, 0x8b7d817d,
- 0xbf06817d, 0xbfa20004,
- 0xbef000ff, 0x00000200,
- 0xbeff0080, 0xbfa00003,
- 0xbef000ff, 0x00000400,
- 0xbeff00c1, 0xb8fb3b05,
- 0x807b817b, 0x847b827b,
- 0x857d9973, 0x8b7d817d,
- 0xbf06817d, 0xbfa2001b,
- 0xbef600ff, 0x01000000,
- 0xbefd0084, 0xbf0a7b7d,
- 0xbfa10040, 0x7e008700,
- 0x7e028701, 0x7e048702,
- 0x7e068703, 0xc4068070,
- 0x008ce800, 0x00000000,
- 0xc4068070, 0x008ce801,
- 0x00008000, 0xc4068070,
- 0x008ce802, 0x00010000,
- 0xc4068070, 0x008ce803,
- 0x00018000, 0x807d847d,
- 0x8070ff70, 0x00000200,
- 0xbf0a7b7d, 0xbfa2ffeb,
- 0xbfa0002a, 0xbef600ff,
- 0x01000000, 0xbefd0084,
- 0xbf0a7b7d, 0xbfa10015,
+ 0x0001ff00, 0x00000100,
+ 0xbf0a7b7d, 0xbfa2fff1,
+ 0xbefe00c1, 0x857d9973,
+ 0x8b7d817d, 0xbf06817d,
+ 0xbfa20004, 0xbef000ff,
+ 0x00000200, 0xbeff0080,
+ 0xbfa00003, 0xbef000ff,
+ 0x00000400, 0xbeff00c1,
+ 0xb8fb3b05, 0x807b817b,
+ 0x847b827b, 0x857d9973,
+ 0x8b7d817d, 0xbf06817d,
+ 0xbfa2001b, 0xbefd0084,
+ 0xbf0a7b7d, 0xbfa10032,
0x7e008700, 0x7e028701,
0x7e048702, 0x7e068703,
- 0xc4068070, 0x008ce800,
- 0x00000000, 0xc4068070,
- 0x008ce801, 0x00010000,
- 0xc4068070, 0x008ce802,
- 0x00020000, 0xc4068070,
- 0x008ce803, 0x00030000,
+ 0x80767074, 0x82778075,
+ 0xee0a4076, 0x000c0000,
+ 0x00000000, 0xee0a4076,
+ 0x008c0000, 0x00008000,
+ 0xee0a4076, 0x010c0000,
+ 0x00010000, 0xee0a4076,
+ 0x018c0000, 0x00018000,
0x807d847d, 0x8070ff70,
- 0x00000400, 0xbf0a7b7d,
- 0xbfa2ffeb, 0xb8fb1e06,
- 0x8b7bc17b, 0xbfa1000d,
- 0x847b837b, 0x807b7d7b,
- 0xbefe00c1, 0xbeff0080,
- 0x7e008700, 0xc4068070,
- 0x008ce800, 0x00000000,
- 0x807d817d, 0x8070ff70,
- 0x00000080, 0xbf0a7b7d,
- 0xbfa2fff7, 0xbfa00171,
- 0xbef4007e, 0x8b75ff7f,
- 0x0000ffff, 0x8c75ff75,
- 0x00040000, 0xbef60080,
- 0xbef700ff, 0x10807fac,
+ 0x00000200, 0xbf0a7b7d,
+ 0xbfa2ffe9, 0xbfa0001a,
+ 0xbefd0084, 0xbf0a7b7d,
+ 0xbfa10017, 0x7e008700,
+ 0x7e028701, 0x7e048702,
+ 0x7e068703, 0x80767074,
+ 0x82778075, 0xee0a4076,
+ 0x000c0000, 0x00000000,
+ 0xee0a4076, 0x008c0000,
+ 0x00010000, 0xee0a4076,
+ 0x010c0000, 0x00020000,
+ 0xee0a4076, 0x018c0000,
+ 0x00030000, 0x807d847d,
+ 0x8070ff70, 0x00000400,
+ 0xbf0a7b7d, 0xbfa2ffe9,
+ 0xbfa0014c, 0xbef4007e,
+ 0x8b75ff7f, 0x0000ffff,
0xbef1007f, 0xb8f20742,
0x84729972, 0x8b6eff7f,
- 0x04000000, 0xbfa1003b,
- 0xbefe00c1, 0x857d9972,
- 0x8b7d817d, 0xbf06817d,
- 0xbfa20002, 0xbeff0080,
- 0xbfa00001, 0xbeff00c1,
- 0xb8ef4306, 0x8b6fc16f,
- 0xbfa10030, 0x846f896f,
- 0xbef6006f, 0xb8f83b05,
- 0x80788178, 0xbf0d9972,
- 0xbfa20002, 0x84788978,
- 0xbfa00001, 0x84788a78,
- 0xb8ee1e06, 0x846e8a6e,
- 0x80786e78, 0x8078ff78,
- 0x00000200, 0x8078ff78,
- 0x00000080, 0xbef600ff,
- 0x01000000, 0x857d9972,
- 0x8b7d817d, 0xbf06817d,
- 0xbefd0080, 0xbfa2000d,
- 0xc4050078, 0x0080e800,
- 0x00000000, 0xbf8a0000,
- 0xdac00000, 0x00000000,
- 0x807dff7d, 0x00000080,
- 0x8078ff78, 0x00000080,
- 0xbf0a6f7d, 0xbfa2fff4,
- 0xbfa0000c, 0xc4050078,
- 0x0080e800, 0x00000000,
- 0xbf8a0000, 0xdac00000,
- 0x00000000, 0x807dff7d,
- 0x00000100, 0x8078ff78,
- 0x00000100, 0xbf0a6f7d,
- 0xbfa2fff4, 0xbef80080,
+ 0x04000000, 0xbfa10044,
0xbefe00c1, 0x857d9972,
0x8b7d817d, 0xbf06817d,
- 0xbfa20002, 0xbeff0080,
- 0xbfa00001, 0xbeff00c1,
- 0xb8ef3b05, 0x806f816f,
- 0x846f826f, 0x857d9972,
- 0x8b7d817d, 0xbf06817d,
- 0xbfa2002c, 0xbef600ff,
- 0x01000000, 0xbeee0078,
- 0x8078ff78, 0x00000200,
- 0xbefd0084, 0xbf0a6f7d,
- 0xbfa10061, 0xc4050078,
- 0x008ce800, 0x00000000,
- 0xc4050078, 0x008ce801,
- 0x00008000, 0xc4050078,
- 0x008ce802, 0x00010000,
- 0xc4050078, 0x008ce803,
- 0x00018000, 0xbf8a0000,
- 0x7e008500, 0x7e028501,
- 0x7e048502, 0x7e068503,
- 0x807d847d, 0x8078ff78,
- 0x00000200, 0xbf0a6f7d,
- 0xbfa2ffea, 0xc405006e,
- 0x008ce800, 0x00000000,
- 0xc405006e, 0x008ce801,
- 0x00008000, 0xc405006e,
- 0x008ce802, 0x00010000,
- 0xc405006e, 0x008ce803,
- 0x00018000, 0xbf8a0000,
- 0xbfa0003d, 0xbef600ff,
- 0x01000000, 0xbeee0078,
- 0x8078ff78, 0x00000400,
- 0xbefd0084, 0xbf0a6f7d,
- 0xbfa10016, 0xc4050078,
- 0x008ce800, 0x00000000,
- 0xc4050078, 0x008ce801,
- 0x00010000, 0xc4050078,
- 0x008ce802, 0x00020000,
- 0xc4050078, 0x008ce803,
- 0x00030000, 0xbf8a0000,
- 0x7e008500, 0x7e028501,
- 0x7e048502, 0x7e068503,
- 0x807d847d, 0x8078ff78,
- 0x00000400, 0xbf0a6f7d,
- 0xbfa2ffea, 0xb8ef1e06,
- 0x8b6fc16f, 0xbfa1000f,
- 0x846f836f, 0x806f7d6f,
- 0xbefe00c1, 0xbeff0080,
- 0xc4050078, 0x008ce800,
- 0x00000000, 0xbf8a0000,
- 0x7e008500, 0x807d817d,
- 0x8078ff78, 0x00000080,
- 0xbf0a6f7d, 0xbfa2fff6,
- 0xbeff00c1, 0xc405006e,
- 0x008ce800, 0x00000000,
- 0xc405006e, 0x008ce801,
- 0x00010000, 0xc405006e,
- 0x008ce802, 0x00020000,
- 0xc405006e, 0x008ce803,
- 0x00030000, 0xbf8a0000,
+ 0xbfa20002, 0xbeff0080,
+ 0xbfa00001, 0xbeff00c1,
+ 0xb8ef4306, 0x8b6fc16f,
+ 0xbfa10039, 0x846f896f,
0xb8f83b05, 0x80788178,
0xbf0d9972, 0xbfa20002,
0x84788978, 0xbfa00001,
- 0x84788a78, 0xb8ee1e06,
- 0x846e8a6e, 0x80786e78,
+ 0x84788a78, 0x8078ff78,
+ 0x00000200, 0x8078ff78,
+ 0x00000080, 0x857d9972,
+ 0x8b7d817d, 0xbf06817d,
+ 0xbefd0080, 0xd71f0001,
+ 0x000100c1, 0xd7200001,
+ 0x000202c1, 0x30020282,
+ 0xbfa20012, 0x80767874,
+ 0x82778075, 0xee0a0076,
+ 0x000c0000, 0x00000000,
+ 0xbf8a0000, 0xd8340000,
+ 0x00000001, 0xd5250001,
+ 0x0001ff01, 0x00000080,
+ 0x807dff7d, 0x00000080,
+ 0x8078ff78, 0x00000080,
+ 0xbf0a6f7d, 0xbfa2ffef,
+ 0xbfa00011, 0x80767874,
+ 0x82778075, 0xee0a0076,
+ 0x000c0000, 0x00000000,
+ 0xbf8a0000, 0xd8340000,
+ 0x00000001, 0xd5250001,
+ 0x0001ff01, 0x00000100,
+ 0x807dff7d, 0x00000100,
+ 0x8078ff78, 0x00000100,
+ 0xbf0a6f7d, 0xbfa2ffef,
+ 0xbef80080, 0xbefe00c1,
+ 0x857d9972, 0x8b7d817d,
+ 0xbf06817d, 0xbfa20002,
+ 0xbeff0080, 0xbfa00001,
+ 0xbeff00c1, 0xb8ef3b05,
+ 0x806f816f, 0x846f826f,
+ 0x857d9972, 0x8b7d817d,
+ 0xbf06817d, 0xbfa2002c,
+ 0xbeee0078, 0x8078ff78,
+ 0x00000200, 0xbefd0084,
+ 0x80767874, 0x82778075,
+ 0xee0a0076, 0x000c0000,
+ 0x00000000, 0xee0a0076,
+ 0x000c0001, 0x00008000,
+ 0xee0a0076, 0x000c0002,
+ 0x00010000, 0xee0a0076,
+ 0x000c0003, 0x00018000,
+ 0xbf8a0000, 0x7e008500,
+ 0x7e028501, 0x7e048502,
+ 0x7e068503, 0x807d847d,
0x8078ff78, 0x00000200,
- 0x80f8ff78, 0x00000050,
- 0xbef600ff, 0x01000000,
+ 0xbf0a6f7d, 0xbfa2ffe8,
+ 0x80766e74, 0x82778075,
+ 0xee0a0076, 0x000c0000,
+ 0x00000000, 0xee0a0076,
+ 0x000c0001, 0x00008000,
+ 0xee0a0076, 0x000c0002,
+ 0x00010000, 0xee0a0076,
+ 0x000c0003, 0x00018000,
+ 0xbf8a0000, 0xbfa0002d,
+ 0xbeee0078, 0x8078ff78,
+ 0x00000400, 0xbefd0084,
+ 0xbf0a6f7d, 0xbfa10018,
+ 0x80767874, 0x82778075,
+ 0xee0a0076, 0x000c0000,
+ 0x00000000, 0xee0a0076,
+ 0x000c0001, 0x00010000,
+ 0xee0a0076, 0x000c0002,
+ 0x00020000, 0xee0a0076,
+ 0x000c0003, 0x00030000,
+ 0xbf8a0000, 0x7e008500,
+ 0x7e028501, 0x7e048502,
+ 0x7e068503, 0x807d847d,
+ 0x8078ff78, 0x00000400,
+ 0xbf0a6f7d, 0xbfa2ffe8,
+ 0x80766e74, 0x82778075,
+ 0xee0a0076, 0x000c0000,
+ 0x00000000, 0xee0a0076,
+ 0x000c0001, 0x00010000,
+ 0xee0a0076, 0x000c0002,
+ 0x00020000, 0xee0a0076,
+ 0x000c0003, 0x00030000,
+ 0xbf8a0000, 0xb8f83b05,
+ 0x80788178, 0xbf0d9972,
+ 0xbfa20002, 0x84788978,
+ 0xbfa00001, 0x84788a78,
+ 0x8078ff78, 0x00000200,
+ 0x80f8ff78, 0x00000060,
+ 0x80767874, 0x82778075,
0xbefd00ff, 0x0000006c,
- 0x80f89078, 0xf462403a,
- 0xf0000000, 0xbf8a0000,
- 0x80fd847d, 0xbf800000,
- 0xbe804300, 0xbe824302,
- 0x80f8a078, 0xf462603a,
- 0xf0000000, 0xbf8a0000,
+ 0xf460403b, 0xf8000000,
+ 0xbf8a0000, 0x80fd847d,
+ 0xbf800000, 0xbe804300,
+ 0xbe824302, 0x80f6a076,
+ 0x82f78077, 0xf460603b,
+ 0xf8000000, 0xbf8a0000,
0x80fd887d, 0xbf800000,
0xbe804300, 0xbe824302,
0xbe844304, 0xbe864306,
- 0x80f8c078, 0xf462803a,
- 0xf0000000, 0xbf8a0000,
- 0x80fd907d, 0xbf800000,
- 0xbe804300, 0xbe824302,
- 0xbe844304, 0xbe864306,
- 0xbe884308, 0xbe8a430a,
- 0xbe8c430c, 0xbe8e430e,
- 0xbf06807d, 0xbfa1fff0,
- 0xb980f801, 0x00000000,
- 0xb8f83b05, 0x80788178,
- 0xbf0d9972, 0xbfa20002,
- 0x84788978, 0xbfa00001,
- 0x84788a78, 0xb8ee1e06,
- 0x846e8a6e, 0x80786e78,
+ 0x80f6c076, 0x82f78077,
+ 0xf460803b, 0xf8000000,
+ 0xbf8a0000, 0x80fd907d,
+ 0xbf800000, 0xbe804300,
+ 0xbe824302, 0xbe844304,
+ 0xbe864306, 0xbe884308,
+ 0xbe8a430a, 0xbe8c430c,
+ 0xbe8e430e, 0xbf06807d,
+ 0xbfa1ffef, 0xb980f801,
+ 0x00000000, 0xb8f83b05,
+ 0x80788178, 0xbf0d9972,
+ 0xbfa20002, 0x84788978,
+ 0xbfa00001, 0x84788a78,
0x8078ff78, 0x00000200,
- 0xbef600ff, 0x01000000,
- 0xbeff0071, 0xf4621bfa,
- 0xf0000000, 0x80788478,
- 0xf4621b3a, 0xf0000000,
- 0x80788478, 0xf4621b7a,
- 0xf0000000, 0x80788478,
- 0xf4621c3a, 0xf0000000,
- 0x80788478, 0xf4621c7a,
- 0xf0000000, 0x80788478,
- 0xf4621eba, 0xf0000000,
- 0x80788478, 0xf4621efa,
- 0xf0000000, 0x80788478,
- 0xf4621e7a, 0xf0000000,
- 0x80788478, 0xf4621cfa,
- 0xf0000000, 0x80788478,
- 0xf4621bba, 0xf0000000,
- 0x80788478, 0xbf8a0000,
- 0xb96ef814, 0xf4621bba,
- 0xf0000000, 0x80788478,
- 0xbf8a0000, 0xb96ef815,
- 0xf4621bba, 0xf0000000,
- 0x80788478, 0xbf8a0000,
- 0xb96ef812, 0xf4621bba,
- 0xf0000000, 0x80788478,
- 0xbf8a0000, 0xb96ef813,
- 0x8b6eff7f, 0x04000000,
- 0xbfa1000d, 0x80788478,
- 0xf4621bba, 0xf0000000,
- 0x80788478, 0xbf8a0000,
- 0xbf0d806e, 0xbfa10006,
- 0x856e906e, 0x8b6e6e6e,
- 0xbfa10003, 0xbe804ec1,
- 0x816ec16e, 0xbfa0fffb,
- 0xbefd006f, 0xbefe0070,
- 0xbeff0071, 0xb97b2011,
- 0x857b867b, 0xb97b0191,
- 0x857b827b, 0xb97bba11,
- 0xb973f801, 0xb8ee3b05,
- 0x806e816e, 0xbf0d9972,
- 0xbfa20002, 0x846e896e,
- 0xbfa00001, 0x846e8a6e,
- 0xb8ef1e06, 0x846f8a6f,
- 0x806e6f6e, 0x806eff6e,
- 0x00000200, 0x806e746e,
- 0x826f8075, 0x8b6fff6f,
- 0x0000ffff, 0xf4605c37,
- 0xf8000050, 0xf4605d37,
- 0xf8000060, 0xf4601e77,
- 0xf8000074, 0xbf8a0000,
+ 0x80767874, 0x82778075,
+ 0xbeff0071, 0xf4601bfb,
+ 0xf8000000, 0xf4601b3b,
+ 0xf8000004, 0xf4601b7b,
+ 0xf8000008, 0xf4601c3b,
+ 0xf800000c, 0xf4601c7b,
+ 0xf8000010, 0xf4601ebb,
+ 0xf8000014, 0xf4601efb,
+ 0xf8000018, 0xf4601e7b,
+ 0xf800001c, 0xf4601cfb,
+ 0xf8000020, 0xf4601bbb,
+ 0xf8000024, 0xbf8a0000,
+ 0xb96ef814, 0xf4601bbb,
+ 0xf8000028, 0xbf8a0000,
+ 0xb96ef815, 0xf4601bbb,
+ 0xf800002c, 0xbf8a0000,
+ 0xb96ef812, 0xf4601bbb,
+ 0xf8000030, 0xbf8a0000,
+ 0xb96ef813, 0x8b6eff7f,
+ 0x04000000, 0xbfa1000b,
+ 0xf4601bbb, 0xf8000038,
+ 0xbf8a0000, 0xbf0d806e,
+ 0xbfa10006, 0x856e906e,
+ 0x8b6e6e6e, 0xbfa10003,
+ 0xbe804ec1, 0x816ec16e,
+ 0xbfa0fffb, 0xbefd006f,
+ 0xbefe0070, 0xbeff0071,
+ 0xb97b2011, 0x857b867b,
+ 0xb97b0191, 0x857b827b,
+ 0xb97bba11, 0xb973f801,
+ 0xb8ee3b05, 0x806e816e,
+ 0xbf0d9972, 0xbfa20002,
+ 0x846e896e, 0xbfa00001,
+ 0x846e8a6e, 0x806eff6e,
+ 0x00000240, 0x806e746e,
+ 0x826f8075, 0xf4605c37,
+ 0xf8000010, 0xf4605d37,
+ 0xf8000020, 0xf4601e77,
+ 0xf8000034, 0xbf8a0000,
0x8b6dff6d, 0x0000ffff,
0x8bfe7e7e, 0x8bea6a6a,
0x936eff77, 0x0002001a,
@@ -4666,14 +4587,18 @@ static const uint32_t cwsr_trap_gfx9_5_0_hex[] = {
};
static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
- 0xbfa00001, 0xbfa003b7,
- 0xb0804009, 0xb8f8f804,
+ 0xbfa00001, 0xbfa003aa,
+ 0xb0804009, 0xb8eef81a,
+ 0xbf880000, 0xb980081a,
+ 0x00000000, 0xb8f8f804,
+ 0x9177ff77, 0x0c000000,
+ 0x846e9a6e, 0x8c776e77,
0x9178ff78, 0x00008c00,
0xb8fbf811, 0x8b6eff78,
0x00004000, 0xbfa10008,
0x8b6eff7b, 0x00000080,
0xbfa20018, 0x8b6ea07b,
- 0xbfa200e1, 0xbf830010,
+ 0xbfa200d4, 0xbf830010,
0xb8fbf811, 0xbfa0fffb,
0x8b6eff7b, 0x00000bd0,
0xbfa20010, 0xb8eef812,
@@ -4684,7 +4609,7 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0xf0000000, 0xbfa20005,
0x8b6fff6f, 0x00000200,
0xbfa20002, 0x8b6ea07b,
- 0xbfa200cb, 0x9177ff77,
+ 0xbfa200be, 0x9177ff77,
0x007fc000, 0xb8fa04a1,
0x847a967a, 0x8c777a77,
0xb8fa0421, 0x847a957a,
@@ -4777,263 +4702,230 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0xb97a0421, 0x857a8e77,
0xb97a3021, 0x8bfe7e7e,
0x8bea6a6a, 0x85788978,
- 0xb9783244, 0xbe804a6c,
- 0xb8faf802, 0xbf0d987a,
- 0xbfa10001, 0xbfb00000,
- 0x8b6dff6d, 0x01ffffff,
- 0xbefa0080, 0xb97a0151,
- 0x9177ff77, 0x007fc000,
- 0xb8fa04a1, 0x847a967a,
- 0x8c777a77, 0xb8fa0421,
- 0x847a957a, 0x8c777a77,
- 0xb8fa3021, 0x847a8e7a,
- 0x8c777a77, 0xb980f821,
- 0x00000000, 0xbf0d847b,
- 0xbfa20078, 0xf4003eb6,
- 0xf8000000, 0xf4003bb6,
- 0xf8000008, 0xbfc70001,
- 0x8b76ff7a, 0x80000000,
- 0xbfa20027, 0x9376ff7a,
- 0x00060019, 0x81f9a376,
- 0xbf0b8179, 0xbfa20068,
- 0x81f9ac76, 0xbf0b8179,
- 0xbfa20062, 0x81f9b776,
- 0xbf0b8179, 0xbfa2005f,
- 0x8b76ff7a, 0x000001ff,
- 0xbf06ff76, 0x000000fe,
- 0xbfa2005d, 0xbf06ff76,
- 0x000000ff, 0xbfa20057,
- 0xbf06ff76, 0x000000fa,
- 0xbfa20054, 0x81f9ff76,
- 0x000000e9, 0xbf0b8179,
- 0xbfa20050, 0x8b76ff7b,
- 0xffff0000, 0xbf06ff76,
- 0xbf860000, 0xbfa10051,
- 0x9376ff7b, 0x0002000e,
- 0x8b79ff7b, 0x00003f00,
- 0x85798679, 0x8c767976,
- 0xb9763b01, 0xbfa00049,
- 0x8b76ff7a, 0xfc000000,
- 0xbf06ff76, 0xd4000000,
- 0xbfa20013, 0xbf06ff76,
- 0xc8000000, 0xbfa20027,
- 0x8b76ff7a, 0xff000000,
- 0xbf06ff76, 0xcf000000,
- 0xbfa20039, 0x8b79ff7a,
- 0xffff0000, 0xbf06ff79,
- 0xcc350000, 0xbfa20037,
- 0xbf06ff79, 0xcc3a0000,
- 0xbfa20034, 0xbf06ff76,
- 0xcc000000, 0xbfa10031,
- 0x8b76ff7b, 0x000001ff,
+ 0x936eff77, 0x0002001a,
+ 0xb96ef81a, 0xb9783244,
+ 0xbe804a6c, 0xb8faf802,
+ 0xbf0d987a, 0xbfa10001,
+ 0xbfb00000, 0x8b6dff6d,
+ 0x01ffffff, 0xbefa0080,
+ 0xb97a0151, 0x9177ff77,
+ 0x007fc000, 0xb8fa04a1,
+ 0x847a967a, 0x8c777a77,
+ 0xb8fa0421, 0x847a957a,
+ 0x8c777a77, 0xb8fa3021,
+ 0x847a8e7a, 0x8c777a77,
+ 0xb980f821, 0x00000000,
+ 0xbf0d847b, 0xbfa20078,
+ 0xf4003eb6, 0xf8000000,
+ 0xf4003bb6, 0xf8000008,
+ 0xbfc70001, 0x8b76ff7a,
+ 0x80000000, 0xbfa20027,
+ 0x9376ff7a, 0x00060019,
+ 0x81f9a376, 0xbf0b8179,
+ 0xbfa20068, 0x81f9ac76,
+ 0xbf0b8179, 0xbfa20062,
+ 0x81f9b776, 0xbf0b8179,
+ 0xbfa2005f, 0x8b76ff7a,
+ 0x000001ff, 0xbf06ff76,
+ 0x000000fe, 0xbfa2005d,
0xbf06ff76, 0x000000ff,
- 0xbfa20029, 0xbf06ff76,
- 0x000000fa, 0xbfa20026,
- 0x81f6ff76, 0x000000e9,
- 0xbf0b8176, 0xbfa20022,
- 0x8b76ff7b, 0x0003fe00,
- 0xbf06ff76, 0x0001fe00,
- 0xbfa2001d, 0x8b76ff7b,
- 0x07fc0000, 0xbf06ff76,
- 0x03fc0000, 0xbfa20018,
- 0xbfa00014, 0x9376ff7a,
- 0x00040016, 0x81f68176,
- 0xbf0b8176, 0xbfa20012,
- 0x9376ff7a, 0x00050011,
- 0x81f68176, 0xbf0b8176,
- 0xbfa2000d, 0x8b76ff7a,
+ 0xbfa20057, 0xbf06ff76,
+ 0x000000fa, 0xbfa20054,
+ 0x81f9ff76, 0x000000e9,
+ 0xbf0b8179, 0xbfa20050,
+ 0x8b76ff7b, 0xffff0000,
+ 0xbf06ff76, 0xbf860000,
+ 0xbfa10051, 0x9376ff7b,
+ 0x0002000e, 0x8b79ff7b,
+ 0x00003f00, 0x85798679,
+ 0x8c767976, 0xb9763b01,
+ 0xbfa00049, 0x8b76ff7a,
+ 0xfc000000, 0xbf06ff76,
+ 0xd4000000, 0xbfa20013,
+ 0xbf06ff76, 0xc8000000,
+ 0xbfa20027, 0x8b76ff7a,
+ 0xff000000, 0xbf06ff76,
+ 0xcf000000, 0xbfa20039,
+ 0x8b79ff7a, 0xffff0000,
+ 0xbf06ff79, 0xcc350000,
+ 0xbfa20037, 0xbf06ff79,
+ 0xcc3a0000, 0xbfa20034,
+ 0xbf06ff76, 0xcc000000,
+ 0xbfa10031, 0x8b76ff7b,
0x000001ff, 0xbf06ff76,
- 0x000000ff, 0xbfa20008,
- 0x8b76ff7b, 0x000001ff,
+ 0x000000ff, 0xbfa20029,
+ 0xbf06ff76, 0x000000fa,
+ 0xbfa20026, 0x81f6ff76,
+ 0x000000e9, 0xbf0b8176,
+ 0xbfa20022, 0x8b76ff7b,
+ 0x0003fe00, 0xbf06ff76,
+ 0x0001fe00, 0xbfa2001d,
+ 0x8b76ff7b, 0x07fc0000,
+ 0xbf06ff76, 0x03fc0000,
+ 0xbfa20018, 0xbfa00014,
+ 0x9376ff7a, 0x00040016,
+ 0x81f68176, 0xbf0b8176,
+ 0xbfa20012, 0x9376ff7a,
+ 0x00050011, 0x81f68176,
+ 0xbf0b8176, 0xbfa2000d,
+ 0x8b76ff7a, 0x000001ff,
0xbf06ff76, 0x000000ff,
- 0xbfa20003, 0xbfc70000,
- 0xbefb006e, 0xbfa0ffad,
- 0xbfc70000, 0xbefb006f,
- 0xbfa0ffaa, 0xbfc70000,
- 0xbeee007e, 0xbeef007f,
- 0xbefe0180, 0xbefe4d84,
- 0xbf8a0000, 0x8b7aff7f,
- 0x04000000, 0x847a857a,
- 0x8c6d7a6d, 0xb8eff822,
- 0xb980f822, 0x00000000,
- 0xb8fa2b01, 0x847a997a,
- 0x8c6d7a6d, 0xbefa0080,
- 0xb97a2b01, 0xbefa007e,
+ 0xbfa20008, 0x8b76ff7b,
+ 0x000001ff, 0xbf06ff76,
+ 0x000000ff, 0xbfa20003,
+ 0xbfc70000, 0xbefb006e,
+ 0xbfa0ffad, 0xbfc70000,
+ 0xbefb006f, 0xbfa0ffaa,
+ 0xbfc70000, 0xbeee007e,
+ 0xbeef007f, 0xbefe0180,
+ 0xbefe4d84, 0xbf8a0000,
+ 0x8b7aff7f, 0x04000000,
+ 0x847a857a, 0x8c6d7a6d,
+ 0xb8eff822, 0xb980f822,
+ 0x00000000, 0xb8fa2b01,
+ 0x847a997a, 0x8c6d7a6d,
+ 0xbefa0080, 0xb97a2b01,
+ 0xbefa007e, 0x8b7bff7f,
+ 0x01ffffff, 0xbefe00c1,
+ 0xbeff00c1, 0xee0a407a,
+ 0x000c0000, 0x00000000,
+ 0x7e000280, 0xbefe007a,
+ 0xbeff007b, 0xb8fb0742,
+ 0x847b997b, 0xb8fa3b05,
+ 0x807a817a, 0xbf0d997b,
+ 0xbfa20002, 0x847a897a,
+ 0xbfa00001, 0x847a8a7a,
0x8b7bff7f, 0x01ffffff,
- 0xbefe00c1, 0xbeff00c1,
- 0xee0a407a, 0x000c0000,
- 0x00000000, 0x7e000280,
+ 0x807aff7a, 0x000001c0,
+ 0x807a7e7a, 0x827b807b,
+ 0xd7610000, 0x00010870,
+ 0xd7610000, 0x00010a71,
+ 0xd7610000, 0x00010c72,
+ 0xd7610000, 0x00010e73,
+ 0xd7610000, 0x00011074,
+ 0xd7610000, 0x00011275,
+ 0xd7610000, 0x00011476,
+ 0xd7610000, 0x00011677,
+ 0xd7610000, 0x00011a79,
+ 0xd7610000, 0x00011c7e,
+ 0xd7610000, 0x00011e7f,
+ 0xbefe00ff, 0x00003fff,
+ 0xbeff0080, 0xee0a407a,
+ 0x000c0000, 0x00000000,
+ 0xd760007a, 0x00011d00,
+ 0xd760007b, 0x00011f00,
0xbefe007a, 0xbeff007b,
- 0xb8fb0742, 0x847b997b,
- 0xb8fa3b05, 0x807a817a,
- 0xbf0d997b, 0xbfa20002,
- 0x847a897a, 0xbfa00001,
- 0x847a8a7a, 0x8b7bff7f,
- 0x01ffffff, 0x807aff7a,
- 0x000001c0, 0x807a7e7a,
- 0x827b807b, 0xd7610000,
- 0x00010870, 0xd7610000,
- 0x00010a71, 0xd7610000,
- 0x00010c72, 0xd7610000,
- 0x00010e73, 0xd7610000,
- 0x00011074, 0xd7610000,
- 0x00011275, 0xd7610000,
- 0x00011476, 0xd7610000,
- 0x00011677, 0xd7610000,
- 0x00011a79, 0xd7610000,
- 0x00011c7e, 0xd7610000,
- 0x00011e7f, 0xbefe00ff,
- 0x00003fff, 0xbeff0080,
- 0xee0a407a, 0x000c0000,
- 0x00000000, 0xd760007a,
- 0x00011d00, 0xd760007b,
- 0x00011f00, 0xbefe007a,
- 0xbeff007b, 0xbef4007e,
- 0x8b75ff7f, 0x01ffffff,
- 0xbef1007d, 0xb8f30742,
- 0x84739973, 0xbefe00c1,
- 0x857d9973, 0x8b7d817d,
- 0xbf06817d, 0xbfa20002,
- 0xbeff0080, 0xbfa00002,
- 0xbeff00c1, 0xbfa0000a,
- 0xee0a4074, 0x008c0000,
- 0x00008000, 0xee0a4074,
- 0x010c0000, 0x00010000,
- 0xee0a4074, 0x018c0000,
- 0x00018000, 0xbfa00009,
- 0xee0a4074, 0x008c0000,
+ 0xbef4007e, 0x8b75ff7f,
+ 0x01ffffff, 0xbef1007d,
+ 0xb8f30742, 0x84739973,
+ 0xbefe00c1, 0x857d9973,
+ 0x8b7d817d, 0xbf06817d,
+ 0xbfa20002, 0xbeff0080,
+ 0xbfa00002, 0xbeff00c1,
+ 0xbfa0000a, 0xee0a4074,
+ 0x008c0000, 0x00008000,
+ 0xee0a4074, 0x010c0000,
0x00010000, 0xee0a4074,
- 0x010c0000, 0x00020000,
- 0xee0a4074, 0x018c0000,
- 0x00030000, 0xb8f03b05,
- 0x80708170, 0xbf0d9973,
- 0xbfa20002, 0x84708970,
- 0xbfa00001, 0x84708a70,
- 0x8070ff70, 0x00000200,
- 0x7e000280, 0x7e020280,
- 0x7e040280, 0xbefd0080,
- 0xb8faf802, 0xbf0c8b7a,
- 0xbfa20003, 0xbe804fc2,
- 0xbf94fffe, 0xbfa10001,
- 0xbe804ec4, 0xbf94fffc,
- 0xb8faf804, 0x8b7aff7a,
- 0x0001000c, 0x9178ff78,
- 0x0001000c, 0x8c787a78,
- 0xd7610002, 0x0000fa71,
- 0x807d817d, 0xd7610002,
- 0x0000fa6c, 0x807d817d,
- 0x917aff6d, 0x80000000,
- 0xd7610002, 0x0000fa7a,
- 0x807d817d, 0xd7610002,
- 0x0000fa6e, 0x807d817d,
- 0xbefa0080, 0xd7610002,
+ 0x018c0000, 0x00018000,
+ 0xbfa00009, 0xee0a4074,
+ 0x008c0000, 0x00010000,
+ 0xee0a4074, 0x010c0000,
+ 0x00020000, 0xee0a4074,
+ 0x018c0000, 0x00030000,
+ 0xb8f03b05, 0x80708170,
+ 0xbf0d9973, 0xbfa20002,
+ 0x84708970, 0xbfa00001,
+ 0x84708a70, 0x8070ff70,
+ 0x00000200, 0x7e000280,
+ 0x7e020280, 0x7e040280,
+ 0xbefd0080, 0xb8faf802,
+ 0xbf0c8b7a, 0xbfa20003,
+ 0xbe804fc2, 0xbf94fffe,
+ 0xbfa10001, 0xbe804ec4,
+ 0xbf94fffc, 0xb8faf804,
+ 0x8b7aff7a, 0x0001000c,
+ 0x9178ff78, 0x0001000c,
+ 0x8c787a78, 0xd7610002,
+ 0x0000fa71, 0x807d817d,
+ 0xd7610002, 0x0000fa6c,
+ 0x807d817d, 0x917aff6d,
+ 0x80000000, 0xd7610002,
0x0000fa7a, 0x807d817d,
- 0xd7610002, 0x0000fa78,
- 0x807d817d, 0xb8faf811,
+ 0xd7610002, 0x0000fa6e,
+ 0x807d817d, 0xbefa0080,
0xd7610002, 0x0000fa7a,
0x807d817d, 0xd7610002,
- 0x0000fa6f, 0x807d817d,
- 0xb8f1f801, 0x937aff6d,
- 0x00060019, 0x847a8c7a,
- 0x8c717a71, 0xd7610002,
- 0x0000fa71, 0x807d817d,
- 0xb8f1f814, 0xd7610002,
- 0x0000fa71, 0x807d817d,
- 0xb8f1f815, 0xd7610002,
- 0x0000fa71, 0x807d817d,
- 0xb8f1f812, 0xd7610002,
- 0x0000fa71, 0x807d817d,
- 0xb8f1f813, 0xd7610002,
- 0x0000fa71, 0x807d817d,
- 0xb8faf802, 0xd7610002,
+ 0x0000fa78, 0x807d817d,
+ 0xb8faf811, 0xd7610002,
0x0000fa7a, 0x807d817d,
- 0xbefa50c1, 0xbfc70000,
+ 0xd7610002, 0x0000fa6f,
+ 0x807d817d, 0xb8f1f801,
+ 0x937aff6d, 0x00060019,
+ 0x847a8c7a, 0x8c717a71,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8f1f814,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8f1f815,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8f1f812,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8f1f813,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8faf802,
0xd7610002, 0x0000fa7a,
- 0x807d817d, 0xbefa4c88,
+ 0x807d817d, 0xbefa50c1,
0xbfc70000, 0xd7610002,
0x0000fa7a, 0x807d817d,
- 0xbefe00ff, 0x0000ffff,
- 0xbeff0080, 0x80767074,
- 0x82778075, 0xee0a4076,
- 0x010c0000, 0x00000000,
- 0xbefe00c1, 0x7e040280,
- 0xbefa5081, 0xbfc70000,
- 0xd7610002, 0x0001007a,
- 0xbefa5082, 0xbfc70000,
- 0xd7610002, 0x0001027a,
- 0xbefa5083, 0xbfc70000,
- 0xd7610002, 0x0001047a,
- 0xbefa5084, 0xbfc70000,
- 0xd7610002, 0x0001067a,
- 0xbefa5085, 0xbfc70000,
- 0xd7610002, 0x0001087a,
- 0xbefa5086, 0xbfc70000,
- 0xd7610002, 0x00010a7a,
- 0xbefa5087, 0xbfc70000,
- 0xd7610002, 0x00010c7a,
- 0xbefa5088, 0xbfc70000,
- 0xd7610002, 0x00010e7a,
- 0xbefa5089, 0xbfc70000,
- 0xd7610002, 0x0001107a,
- 0xbefa508a, 0xbfc70000,
- 0xd7610002, 0x0001127a,
- 0xbefa508b, 0xbfc70000,
- 0xd7610002, 0x0001147a,
- 0xbefa508c, 0xbfc70000,
- 0xd7610002, 0x0001167a,
- 0xbefa508d, 0xbfc70000,
- 0xd7610002, 0x0001187a,
- 0xbefa508e, 0xbfc70000,
- 0xd7610002, 0x00011a7a,
- 0xbefa508f, 0xbfc70000,
- 0xd7610002, 0x00011c7a,
- 0xbefa5090, 0xbfc70000,
- 0xd7610002, 0x00011e7a,
+ 0xbefa4c88, 0xbfc70000,
+ 0xd7610002, 0x0000fa7a,
+ 0x807d817d, 0xbefe00ff,
+ 0x0000ffff, 0xbeff0080,
+ 0x80767074, 0x82778075,
0xee0a4076, 0x010c0000,
- 0x00008000, 0xb8f03b05,
- 0x80708170, 0xbf0d9973,
- 0xbfa20002, 0x84708970,
- 0xbfa00001, 0x84708a70,
- 0xbef90080, 0xbefd0080,
- 0xbf800000, 0xbe804100,
- 0xbe824102, 0xbe844104,
- 0xbe864106, 0xbe884108,
- 0xbe8a410a, 0xbe8c410c,
- 0xbe8e410e, 0xd7610002,
- 0x0000f200, 0x80798179,
- 0xd7610002, 0x0000f201,
- 0x80798179, 0xd7610002,
- 0x0000f202, 0x80798179,
- 0xd7610002, 0x0000f203,
- 0x80798179, 0xd7610002,
- 0x0000f204, 0x80798179,
- 0xd7610002, 0x0000f205,
- 0x80798179, 0xd7610002,
- 0x0000f206, 0x80798179,
- 0xd7610002, 0x0000f207,
- 0x80798179, 0xd7610002,
- 0x0000f208, 0x80798179,
- 0xd7610002, 0x0000f209,
- 0x80798179, 0xd7610002,
- 0x0000f20a, 0x80798179,
- 0xd7610002, 0x0000f20b,
- 0x80798179, 0xd7610002,
- 0x0000f20c, 0x80798179,
- 0xd7610002, 0x0000f20d,
- 0x80798179, 0xd7610002,
- 0x0000f20e, 0x80798179,
- 0xd7610002, 0x0000f20f,
- 0x80798179, 0xbf06a079,
- 0xbfa10009, 0x80767074,
- 0x82778075, 0xee0a4076,
- 0x010c0000, 0x00000000,
- 0x8070ff70, 0x00000080,
- 0xbef90080, 0x7e040280,
- 0x807d907d, 0xbf0aff7d,
- 0x00000060, 0xbfa2ffb9,
+ 0x00000000, 0xbefe00c1,
+ 0x7e040280, 0xbefa5081,
+ 0xbfc70000, 0xd7610002,
+ 0x0001007a, 0xbefa5082,
+ 0xbfc70000, 0xd7610002,
+ 0x0001027a, 0xbefa5083,
+ 0xbfc70000, 0xd7610002,
+ 0x0001047a, 0xbefa5084,
+ 0xbfc70000, 0xd7610002,
+ 0x0001067a, 0xbefa5085,
+ 0xbfc70000, 0xd7610002,
+ 0x0001087a, 0xbefa5086,
+ 0xbfc70000, 0xd7610002,
+ 0x00010a7a, 0xbefa5087,
+ 0xbfc70000, 0xd7610002,
+ 0x00010c7a, 0xbefa5088,
+ 0xbfc70000, 0xd7610002,
+ 0x00010e7a, 0xbefa5089,
+ 0xbfc70000, 0xd7610002,
+ 0x0001107a, 0xbefa508a,
+ 0xbfc70000, 0xd7610002,
+ 0x0001127a, 0xbefa508b,
+ 0xbfc70000, 0xd7610002,
+ 0x0001147a, 0xbefa508c,
+ 0xbfc70000, 0xd7610002,
+ 0x0001167a, 0xbefa508d,
+ 0xbfc70000, 0xd7610002,
+ 0x0001187a, 0xbefa508e,
+ 0xbfc70000, 0xd7610002,
+ 0x00011a7a, 0xbefa508f,
+ 0xbfc70000, 0xd7610002,
+ 0x00011c7a, 0xbefa5090,
+ 0xbfc70000, 0xd7610002,
+ 0x00011e7a, 0xee0a4076,
+ 0x010c0000, 0x00008000,
+ 0xb8f03b05, 0x80708170,
+ 0xbf0d9973, 0xbfa20002,
+ 0x84708970, 0xbfa00001,
+ 0x84708a70, 0xbef90080,
+ 0xbefd0080, 0xbf800000,
0xbe804100, 0xbe824102,
0xbe844104, 0xbe864106,
0xbe884108, 0xbe8a410a,
+ 0xbe8c410c, 0xbe8e410e,
0xd7610002, 0x0000f200,
0x80798179, 0xd7610002,
0x0000f201, 0x80798179,
@@ -5052,271 +4944,307 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0xd7610002, 0x0000f20a,
0x80798179, 0xd7610002,
0x0000f20b, 0x80798179,
- 0xbefe00ff, 0x0000ffff,
+ 0xd7610002, 0x0000f20c,
+ 0x80798179, 0xd7610002,
+ 0x0000f20d, 0x80798179,
+ 0xd7610002, 0x0000f20e,
+ 0x80798179, 0xd7610002,
+ 0x0000f20f, 0x80798179,
+ 0xbf06a079, 0xbfa10009,
0x80767074, 0x82778075,
0xee0a4076, 0x010c0000,
- 0x00000000, 0xbefe00c1,
- 0x857d9973, 0x8b7d817d,
- 0xbf06817d, 0xbfa20002,
- 0xbeff0080, 0xbfa00001,
- 0xbeff00c1, 0xb8fb4306,
- 0x8b7bc17b, 0xbfa10042,
- 0x8b7aff6d, 0x80000000,
- 0xbfa1003f, 0x847b8a7b,
- 0xb8f03b05, 0x80708170,
- 0xbf0d9973, 0xbfa20002,
- 0x84708970, 0xbfa00001,
- 0x84708a70, 0x8070ff70,
- 0x00000200, 0x8070ff70,
- 0x00000200, 0xd71f0000,
- 0x000100c1, 0xd7200000,
- 0x000200c1, 0x16000084,
- 0x857d9973, 0x8b7d817d,
- 0xbf06817d, 0xbefd0080,
- 0xbfa20015, 0xbe8300ff,
- 0x00000080, 0xbf800000,
- 0xbf800000, 0xbf800000,
- 0xd8d80000, 0x01000000,
- 0xbf8a0000, 0x80767074,
+ 0x00000000, 0x8070ff70,
+ 0x00000080, 0xbef90080,
+ 0x7e040280, 0x807d907d,
+ 0xbf0aff7d, 0x00000060,
+ 0xbfa2ffb9, 0xbe804100,
+ 0xbe824102, 0xbe844104,
+ 0xbe864106, 0xbe884108,
+ 0xbe8a410a, 0xd7610002,
+ 0x0000f200, 0x80798179,
+ 0xd7610002, 0x0000f201,
+ 0x80798179, 0xd7610002,
+ 0x0000f202, 0x80798179,
+ 0xd7610002, 0x0000f203,
+ 0x80798179, 0xd7610002,
+ 0x0000f204, 0x80798179,
+ 0xd7610002, 0x0000f205,
+ 0x80798179, 0xd7610002,
+ 0x0000f206, 0x80798179,
+ 0xd7610002, 0x0000f207,
+ 0x80798179, 0xd7610002,
+ 0x0000f208, 0x80798179,
+ 0xd7610002, 0x0000f209,
+ 0x80798179, 0xd7610002,
+ 0x0000f20a, 0x80798179,
+ 0xd7610002, 0x0000f20b,
+ 0x80798179, 0xbefe00ff,
+ 0x0000ffff, 0x80767074,
0x82778075, 0xee0a4076,
- 0x008c0000, 0x00000000,
- 0x807d037d, 0x80700370,
- 0xd5250000, 0x0001ff00,
- 0x00000080, 0xbf0a7b7d,
- 0xbfa2fff1, 0xbfa00014,
- 0xbe8300ff, 0x00000100,
- 0xbf800000, 0xbf800000,
- 0xbf800000, 0xd8d80000,
- 0x01000000, 0xbf8a0000,
- 0x80767074, 0x82778075,
- 0xee0a4076, 0x008c0000,
- 0x00000000, 0x807d037d,
- 0x80700370, 0xd5250000,
- 0x0001ff00, 0x00000100,
- 0xbf0a7b7d, 0xbfa2fff1,
+ 0x010c0000, 0x00000000,
0xbefe00c1, 0x857d9973,
0x8b7d817d, 0xbf06817d,
- 0xbfa20004, 0xbef000ff,
- 0x00000200, 0xbeff0080,
- 0xbfa00003, 0xbef000ff,
- 0x00000400, 0xbeff00c1,
- 0xb8fb3b05, 0x807b817b,
- 0x847b827b, 0x857d9973,
+ 0xbfa20002, 0xbeff0080,
+ 0xbfa00001, 0xbeff00c1,
+ 0xb8fb4306, 0x8b7bc17b,
+ 0xbfa10042, 0x8b7aff6d,
+ 0x80000000, 0xbfa1003f,
+ 0x847b8a7b, 0xb8f03b05,
+ 0x80708170, 0xbf0d9973,
+ 0xbfa20002, 0x84708970,
+ 0xbfa00001, 0x84708a70,
+ 0x8070ff70, 0x00000200,
+ 0x8070ff70, 0x00000200,
+ 0xd71f0000, 0x000100c1,
+ 0xd7200000, 0x000200c1,
+ 0x16000084, 0x857d9973,
0x8b7d817d, 0xbf06817d,
- 0xbfa2001b, 0xbefd0084,
- 0xbf0a7b7d, 0xbfa10032,
- 0x7e008700, 0x7e028701,
- 0x7e048702, 0x7e068703,
- 0x80767074, 0x82778075,
- 0xee0a4076, 0x000c0000,
- 0x00000000, 0xee0a4076,
- 0x008c0000, 0x00008000,
- 0xee0a4076, 0x010c0000,
- 0x00010000, 0xee0a4076,
- 0x018c0000, 0x00018000,
- 0x807d847d, 0x8070ff70,
- 0x00000200, 0xbf0a7b7d,
- 0xbfa2ffe9, 0xbfa0001a,
+ 0xbefd0080, 0xbfa20015,
+ 0xbe8300ff, 0x00000080,
+ 0xbf800000, 0xbf800000,
+ 0xbf800000, 0xd8d80000,
+ 0x01000000, 0xbf8a0000,
+ 0x80767074, 0x82778075,
+ 0xee0a4076, 0x008c0000,
+ 0x00000000, 0x807d037d,
+ 0x80700370, 0xd5250000,
+ 0x0001ff00, 0x00000080,
+ 0xbf0a7b7d, 0xbfa2fff1,
+ 0xbfa00014, 0xbe8300ff,
+ 0x00000100, 0xbf800000,
+ 0xbf800000, 0xbf800000,
+ 0xd8d80000, 0x01000000,
+ 0xbf8a0000, 0x80767074,
+ 0x82778075, 0xee0a4076,
+ 0x008c0000, 0x00000000,
+ 0x807d037d, 0x80700370,
+ 0xd5250000, 0x0001ff00,
+ 0x00000100, 0xbf0a7b7d,
+ 0xbfa2fff1, 0xbefe00c1,
+ 0x857d9973, 0x8b7d817d,
+ 0xbf06817d, 0xbfa20004,
+ 0xbef000ff, 0x00000200,
+ 0xbeff0080, 0xbfa00003,
+ 0xbef000ff, 0x00000400,
+ 0xbeff00c1, 0xb8fb3b05,
+ 0x807b817b, 0x847b827b,
+ 0x857d9973, 0x8b7d817d,
+ 0xbf06817d, 0xbfa2001b,
0xbefd0084, 0xbf0a7b7d,
- 0xbfa10017, 0x7e008700,
+ 0xbfa10032, 0x7e008700,
0x7e028701, 0x7e048702,
0x7e068703, 0x80767074,
0x82778075, 0xee0a4076,
0x000c0000, 0x00000000,
0xee0a4076, 0x008c0000,
- 0x00010000, 0xee0a4076,
- 0x010c0000, 0x00020000,
+ 0x00008000, 0xee0a4076,
+ 0x010c0000, 0x00010000,
0xee0a4076, 0x018c0000,
- 0x00030000, 0x807d847d,
- 0x8070ff70, 0x00000400,
+ 0x00018000, 0x807d847d,
+ 0x8070ff70, 0x00000200,
0xbf0a7b7d, 0xbfa2ffe9,
- 0xbfa00180, 0xbef4007e,
- 0x8b75ff7f, 0x01ffffff,
- 0xbef1007f, 0xb8f20742,
- 0x84729972, 0x8b6eff7f,
- 0x04000000, 0xbfa10044,
+ 0xbfa0001a, 0xbefd0084,
+ 0xbf0a7b7d, 0xbfa10017,
+ 0x7e008700, 0x7e028701,
+ 0x7e048702, 0x7e068703,
+ 0x80767074, 0x82778075,
+ 0xee0a4076, 0x000c0000,
+ 0x00000000, 0xee0a4076,
+ 0x008c0000, 0x00010000,
+ 0xee0a4076, 0x010c0000,
+ 0x00020000, 0xee0a4076,
+ 0x018c0000, 0x00030000,
+ 0x807d847d, 0x8070ff70,
+ 0x00000400, 0xbf0a7b7d,
+ 0xbfa2ffe9, 0xbfa00183,
+ 0xbef4007e, 0x8b75ff7f,
+ 0x01ffffff, 0xbef1007f,
+ 0xb8f20742, 0x84729972,
+ 0x8b6eff7f, 0x04000000,
+ 0xbfa10044, 0xbefe00c1,
+ 0x857d9972, 0x8b7d817d,
+ 0xbf06817d, 0xbfa20002,
+ 0xbeff0080, 0xbfa00001,
+ 0xbeff00c1, 0xb8ef4306,
+ 0x8b6fc16f, 0xbfa10039,
+ 0x846f8a6f, 0xb8f83b05,
+ 0x80788178, 0xbf0d9972,
+ 0xbfa20002, 0x84788978,
+ 0xbfa00001, 0x84788a78,
+ 0x8078ff78, 0x00000200,
+ 0x8078ff78, 0x00000200,
+ 0x857d9972, 0x8b7d817d,
+ 0xbf06817d, 0xbefd0080,
+ 0xd71f0001, 0x000100c1,
+ 0xd7200001, 0x000202c1,
+ 0x30020282, 0xbfa20012,
+ 0x80767874, 0x82778075,
+ 0xee0a0076, 0x000c0000,
+ 0x00000000, 0xbf8a0000,
+ 0xd8340000, 0x00000001,
+ 0xd5250001, 0x0001ff01,
+ 0x00000080, 0x807dff7d,
+ 0x00000080, 0x8078ff78,
+ 0x00000080, 0xbf0a6f7d,
+ 0xbfa2ffef, 0xbfa00011,
+ 0x80767874, 0x82778075,
+ 0xee0a0076, 0x000c0000,
+ 0x00000000, 0xbf8a0000,
+ 0xd8340000, 0x00000001,
+ 0xd5250001, 0x0001ff01,
+ 0x00000100, 0x807dff7d,
+ 0x00000100, 0x8078ff78,
+ 0x00000100, 0xbf0a6f7d,
+ 0xbfa2ffef, 0xbef80080,
0xbefe00c1, 0x857d9972,
0x8b7d817d, 0xbf06817d,
0xbfa20002, 0xbeff0080,
0xbfa00001, 0xbeff00c1,
- 0xb8ef4306, 0x8b6fc16f,
- 0xbfa10039, 0x846f8a6f,
- 0xb8f83b05, 0x80788178,
- 0xbf0d9972, 0xbfa20002,
- 0x84788978, 0xbfa00001,
- 0x84788a78, 0x8078ff78,
- 0x00000200, 0x8078ff78,
- 0x00000200, 0x857d9972,
+ 0xb8ef3b05, 0x806f816f,
+ 0x846f826f, 0x857d9972,
0x8b7d817d, 0xbf06817d,
- 0xbefd0080, 0xd71f0001,
- 0x000100c1, 0xd7200001,
- 0x000202c1, 0x30020282,
- 0xbfa20012, 0x80767874,
+ 0xbfa2002c, 0xbeee0078,
+ 0x8078ff78, 0x00000200,
+ 0xbefd0084, 0x80767874,
0x82778075, 0xee0a0076,
0x000c0000, 0x00000000,
- 0xbf8a0000, 0xd8340000,
- 0x00000001, 0xd5250001,
- 0x0001ff01, 0x00000080,
- 0x807dff7d, 0x00000080,
- 0x8078ff78, 0x00000080,
- 0xbf0a6f7d, 0xbfa2ffef,
- 0xbfa00011, 0x80767874,
+ 0xee0a0076, 0x000c0001,
+ 0x00008000, 0xee0a0076,
+ 0x000c0002, 0x00010000,
+ 0xee0a0076, 0x000c0003,
+ 0x00018000, 0xbf8a0000,
+ 0x7e008500, 0x7e028501,
+ 0x7e048502, 0x7e068503,
+ 0x807d847d, 0x8078ff78,
+ 0x00000200, 0xbf0a6f7d,
+ 0xbfa2ffe8, 0x80766e74,
0x82778075, 0xee0a0076,
0x000c0000, 0x00000000,
- 0xbf8a0000, 0xd8340000,
- 0x00000001, 0xd5250001,
- 0x0001ff01, 0x00000100,
- 0x807dff7d, 0x00000100,
- 0x8078ff78, 0x00000100,
- 0xbf0a6f7d, 0xbfa2ffef,
- 0xbef80080, 0xbefe00c1,
- 0x857d9972, 0x8b7d817d,
- 0xbf06817d, 0xbfa20002,
- 0xbeff0080, 0xbfa00001,
- 0xbeff00c1, 0xb8ef3b05,
- 0x806f816f, 0x846f826f,
- 0x857d9972, 0x8b7d817d,
- 0xbf06817d, 0xbfa2002c,
- 0xbeee0078, 0x8078ff78,
- 0x00000200, 0xbefd0084,
- 0x80767874, 0x82778075,
- 0xee0a0076, 0x000c0000,
- 0x00000000, 0xee0a0076,
- 0x000c0001, 0x00008000,
- 0xee0a0076, 0x000c0002,
+ 0xee0a0076, 0x000c0001,
+ 0x00008000, 0xee0a0076,
+ 0x000c0002, 0x00010000,
+ 0xee0a0076, 0x000c0003,
+ 0x00018000, 0xbf8a0000,
+ 0xbfa0002d, 0xbeee0078,
+ 0x8078ff78, 0x00000400,
+ 0xbefd0084, 0xbf0a6f7d,
+ 0xbfa10018, 0x80767874,
+ 0x82778075, 0xee0a0076,
+ 0x000c0000, 0x00000000,
+ 0xee0a0076, 0x000c0001,
0x00010000, 0xee0a0076,
- 0x000c0003, 0x00018000,
- 0xbf8a0000, 0x7e008500,
- 0x7e028501, 0x7e048502,
- 0x7e068503, 0x807d847d,
- 0x8078ff78, 0x00000200,
- 0xbf0a6f7d, 0xbfa2ffe8,
- 0x80766e74, 0x82778075,
- 0xee0a0076, 0x000c0000,
- 0x00000000, 0xee0a0076,
- 0x000c0001, 0x00008000,
- 0xee0a0076, 0x000c0002,
+ 0x000c0002, 0x00020000,
+ 0xee0a0076, 0x000c0003,
+ 0x00030000, 0xbf8a0000,
+ 0x7e008500, 0x7e028501,
+ 0x7e048502, 0x7e068503,
+ 0x807d847d, 0x8078ff78,
+ 0x00000400, 0xbf0a6f7d,
+ 0xbfa2ffe8, 0x80766e74,
+ 0x82778075, 0xee0a0076,
+ 0x000c0000, 0x00000000,
+ 0xee0a0076, 0x000c0001,
0x00010000, 0xee0a0076,
- 0x000c0003, 0x00018000,
- 0xbf8a0000, 0xbfa0002d,
- 0xbeee0078, 0x8078ff78,
- 0x00000400, 0xbefd0084,
- 0xbf0a6f7d, 0xbfa10018,
- 0x80767874, 0x82778075,
- 0xee0a0076, 0x000c0000,
- 0x00000000, 0xee0a0076,
- 0x000c0001, 0x00010000,
- 0xee0a0076, 0x000c0002,
- 0x00020000, 0xee0a0076,
- 0x000c0003, 0x00030000,
- 0xbf8a0000, 0x7e008500,
- 0x7e028501, 0x7e048502,
- 0x7e068503, 0x807d847d,
- 0x8078ff78, 0x00000400,
- 0xbf0a6f7d, 0xbfa2ffe8,
- 0x80766e74, 0x82778075,
- 0xee0a0076, 0x000c0000,
- 0x00000000, 0xee0a0076,
- 0x000c0001, 0x00010000,
- 0xee0a0076, 0x000c0002,
- 0x00020000, 0xee0a0076,
- 0x000c0003, 0x00030000,
- 0xbf8a0000, 0xb8f83b05,
- 0x80788178, 0xbf0d9972,
- 0xbfa20002, 0x84788978,
- 0xbfa00001, 0x84788a78,
- 0x8078ff78, 0x00000200,
- 0x80f8ff78, 0x00000060,
- 0x80767874, 0x82778075,
- 0xbefd00ff, 0x0000006c,
- 0xf460403b, 0xf8000000,
- 0xbf8a0000, 0x80fd847d,
- 0xbf800000, 0xbe804300,
- 0xbe824302, 0x80f6a076,
- 0x82f78077, 0xf460603b,
+ 0x000c0002, 0x00020000,
+ 0xee0a0076, 0x000c0003,
+ 0x00030000, 0xbf8a0000,
+ 0xb8f83b05, 0x80788178,
+ 0xbf0d9972, 0xbfa20002,
+ 0x84788978, 0xbfa00001,
+ 0x84788a78, 0x8078ff78,
+ 0x00000200, 0x80f8ff78,
+ 0x00000060, 0x80767874,
+ 0x82778075, 0xbefd00ff,
+ 0x0000006c, 0xf460403b,
0xf8000000, 0xbf8a0000,
- 0x80fd887d, 0xbf800000,
+ 0x80fd847d, 0xbf800000,
0xbe804300, 0xbe824302,
- 0xbe844304, 0xbe864306,
- 0x80f6c076, 0x82f78077,
- 0xf460803b, 0xf8000000,
- 0xbf8a0000, 0x80fd907d,
+ 0x80f6a076, 0x82f78077,
+ 0xf460603b, 0xf8000000,
+ 0xbf8a0000, 0x80fd887d,
0xbf800000, 0xbe804300,
0xbe824302, 0xbe844304,
- 0xbe864306, 0xbe884308,
- 0xbe8a430a, 0xbe8c430c,
- 0xbe8e430e, 0xbf06807d,
- 0xbfa1ffef, 0xb980f801,
- 0x00000000, 0xb8f83b05,
- 0x80788178, 0xbf0d9972,
- 0xbfa20002, 0x84788978,
- 0xbfa00001, 0x84788a78,
- 0x8078ff78, 0x00000200,
- 0x80767874, 0x82778075,
- 0xbeff0071, 0xf4601bfb,
- 0xf8000000, 0xf4601b3b,
- 0xf8000004, 0xf4601b7b,
- 0xf8000008, 0xf4601c3b,
- 0xf800000c, 0xf4601c7b,
- 0xf8000010, 0xf4601ebb,
- 0xf8000014, 0xf4601efb,
- 0xf8000018, 0xf4601e7b,
- 0xf800001c, 0xf4601cfb,
- 0xf8000020, 0xf4601bbb,
- 0xf8000024, 0xbf8a0000,
- 0xb96ef814, 0xf4601bbb,
- 0xf8000028, 0xbf8a0000,
- 0xb96ef815, 0xf4601bbb,
- 0xf800002c, 0xbf8a0000,
- 0xb96ef812, 0xf4601bbb,
- 0xf8000030, 0xbf8a0000,
- 0xb96ef813, 0x8b6eff7f,
- 0x04000000, 0xbfa10022,
- 0xf4601bbb, 0xf8000038,
- 0xbf8a0000, 0xbf0d806e,
- 0xbfa1001d, 0x856e906e,
- 0x8b6e6e6e, 0xbfa10003,
- 0xbe804ec1, 0x816ec16e,
- 0xbfa0fffb, 0xbef800ff,
- 0x00000080, 0xbefd0081,
- 0xf4601bbb, 0xf0000000,
- 0xbfc70000, 0x80788478,
- 0x937eff6e, 0x00070004,
- 0x847e907e, 0x8c7d7e7d,
- 0xbe80517d, 0x917dff7d,
- 0x007f0000, 0x856e906e,
- 0x8b6e6e6e, 0xbfa10003,
- 0xbe804e7d, 0x816ec16e,
- 0xbfa0fffb, 0x807d817d,
- 0xbf08907d, 0xbfa1ffec,
- 0xf4601bbb, 0xf800003c,
- 0xbfc70000, 0xbf0d806e,
- 0xbfa1000c, 0xbf0d9a7f,
- 0xbfa10002, 0xbf068180,
- 0xbe804fc4, 0xbf94fffc,
- 0xbfa10006, 0x856e906e,
- 0x8b6e6e6e, 0xbfa10003,
- 0xbe804ec3, 0x816ec16e,
- 0xbfa0fffb, 0xbefd006f,
- 0xbefe0070, 0xbeff0071,
- 0xb979f822, 0xb97b2011,
- 0x857b867b, 0xb97b0191,
- 0x857b827b, 0xb97bba11,
- 0xb973f801, 0xb8ee3b05,
- 0x806e816e, 0xbf0d9972,
- 0xbfa20002, 0x846e896e,
- 0xbfa00001, 0x846e8a6e,
- 0x806eff6e, 0x000001c0,
- 0x806e746e, 0x826f8075,
- 0xf4605c37, 0xf8000010,
- 0xf4605d37, 0xf8000020,
- 0xf4601e77, 0xf8000034,
- 0xbf8a0000, 0x856e9677,
- 0xb96e04a1, 0x856e9577,
- 0xb96e0421, 0x856e8e77,
- 0xb96e3021, 0x8b6dff6d,
- 0x01ffffff, 0x8bfe7e7e,
- 0x8bea6a6a, 0xb97af804,
+ 0xbe864306, 0x80f6c076,
+ 0x82f78077, 0xf460803b,
+ 0xf8000000, 0xbf8a0000,
+ 0x80fd907d, 0xbf800000,
+ 0xbe804300, 0xbe824302,
+ 0xbe844304, 0xbe864306,
+ 0xbe884308, 0xbe8a430a,
+ 0xbe8c430c, 0xbe8e430e,
+ 0xbf06807d, 0xbfa1ffef,
+ 0xb980f801, 0x00000000,
+ 0xb8f83b05, 0x80788178,
+ 0xbf0d9972, 0xbfa20002,
+ 0x84788978, 0xbfa00001,
+ 0x84788a78, 0x8078ff78,
+ 0x00000200, 0x80767874,
+ 0x82778075, 0xbeff0071,
+ 0xf4601bfb, 0xf8000000,
+ 0xf4601b3b, 0xf8000004,
+ 0xf4601b7b, 0xf8000008,
+ 0xf4601c3b, 0xf800000c,
+ 0xf4601c7b, 0xf8000010,
+ 0xf4601ebb, 0xf8000014,
+ 0xf4601efb, 0xf8000018,
+ 0xf4601e7b, 0xf800001c,
+ 0xf4601cfb, 0xf8000020,
+ 0xf4601bbb, 0xf8000024,
+ 0xbf8a0000, 0xb96ef814,
+ 0xf4601bbb, 0xf8000028,
+ 0xbf8a0000, 0xb96ef815,
+ 0xf4601bbb, 0xf800002c,
+ 0xbf8a0000, 0xb96ef812,
+ 0xf4601bbb, 0xf8000030,
+ 0xbf8a0000, 0xb96ef813,
+ 0x8b6eff7f, 0x04000000,
+ 0xbfa10022, 0xf4601bbb,
+ 0xf8000038, 0xbf8a0000,
+ 0xbf0d806e, 0xbfa1001d,
+ 0x856e906e, 0x8b6e6e6e,
+ 0xbfa10003, 0xbe804ec1,
+ 0x816ec16e, 0xbfa0fffb,
+ 0xbef800ff, 0x00000080,
+ 0xbefd0081, 0xf4601bbb,
+ 0xf0000000, 0xbfc70000,
+ 0x80788478, 0x937eff6e,
+ 0x00070004, 0x847e907e,
+ 0x8c7d7e7d, 0xbe80517d,
+ 0x917dff7d, 0x007f0000,
+ 0x856e906e, 0x8b6e6e6e,
+ 0xbfa10003, 0xbe804e7d,
+ 0x816ec16e, 0xbfa0fffb,
+ 0x807d817d, 0xbf08907d,
+ 0xbfa1ffec, 0xf4601bbb,
+ 0xf800003c, 0xbfc70000,
+ 0xbf0d806e, 0xbfa1000c,
+ 0xbf0d9a7f, 0xbfa10002,
+ 0xbf068180, 0xbe804fc4,
+ 0xbf94fffc, 0xbfa10006,
+ 0x856e906e, 0x8b6e6e6e,
+ 0xbfa10003, 0xbe804ec3,
+ 0x816ec16e, 0xbfa0fffb,
+ 0xbefd006f, 0xbefe0070,
+ 0xbeff0071, 0xb979f822,
+ 0xb97b2011, 0x857b867b,
+ 0xb97b0191, 0x857b827b,
+ 0xb97bba11, 0xb973f801,
+ 0xb8ee3b05, 0x806e816e,
+ 0xbf0d9972, 0xbfa20002,
+ 0x846e896e, 0xbfa00001,
+ 0x846e8a6e, 0x806eff6e,
+ 0x000001c0, 0x806e746e,
+ 0x826f8075, 0xf4605c37,
+ 0xf8000010, 0xf4605d37,
+ 0xf8000020, 0xf4601e77,
+ 0xf8000034, 0xbf8a0000,
+ 0x856e9677, 0xb96e04a1,
+ 0x856e9577, 0xb96e0421,
+ 0x856e8e77, 0xb96e3021,
+ 0x8b6dff6d, 0x01ffffff,
+ 0x8bfe7e7e, 0x8bea6a6a,
+ 0x936eff77, 0x0002001a,
+ 0xb96ef81a, 0xb97af804,
0xb8eef802, 0xbf0c8b6e,
0xbfa20003, 0xbe804fc2,
0xbf94fffe, 0xbfa10001,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index b7b82f1c6072..1624a02ad0ef 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -1329,6 +1329,7 @@ end
function restore_sched_mode(s_tmp)
s_bfe_u32 s_tmp, ttmp11, (TTMP11_SCHED_MODE_SHIFT | (TTMP11_SCHED_MODE_SIZE << 0x10))
s_setreg_b32 hwreg(HW_REG_WAVE_SCHED_MODE), s_tmp
+end
function restore_barrier_signal_count(barrier_id)
// extract the saved signal count from s_restore_tmp
--
2.34.1
* [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
2026-01-16 20:39 [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support Jay Cornwall
2026-01-16 20:39 ` [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source Jay Cornwall
@ 2026-01-16 20:39 ` Jay Cornwall
2026-01-20 22:38 ` Lancelot SIX
2026-01-16 20:39 ` [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround Jay Cornwall
` (2 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC (permalink / raw)
To: amd-gfx; +Cc: Jay Cornwall, Lancelot Six, Joseph Greathouse, Vladimir Indic
Scalar loads may arrive out-of-order with respect to KMCNT.
The affected code expects the two loads to arrive in-order.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Lancelot Six <lancelot.six@amd.com>
Cc: Joseph Greathouse <joseph.greathouse@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
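Reviewer note (not part of the commit): the hazard described above can be
illustrated with a toy model. This is only a sketch under stated assumptions:
it treats KMCNT as a plain count of outstanding scalar loads, treats
"s_wait_kmcnt N" as "block until at most N loads are outstanding", and uses
the hypothetical helper name completed_at_wait. It does not model the real
hardware.

```python
def completed_at_wait(issued, return_order, threshold):
    """Toy model (assumption): return the loads that have completed at the
    moment a wait for 'outstanding <= threshold' is satisfied."""
    outstanding = list(issued)
    completed = []
    for load in return_order:
        if len(outstanding) <= threshold:
            break  # the wait is satisfied; execution continues here
        outstanding.remove(load)
        completed.append(load)
    return completed


# Old sequence: issue both loads, then wait for at most 1 outstanding.
in_order = completed_at_wait(["first", "second"], ["first", "second"], 1)
reordered = completed_at_wait(["first", "second"], ["second", "first"], 1)

# New sequence: wait for 0 outstanding right after the first load,
# before the second load is issued.
fixed = completed_at_wait(["first"], ["first"], 0)
```

With in-order returns the first load has completed when the old wait is
satisfied. With the returns swapped, only the second load has completed, which
is the case the patch guards against by waiting for zero outstanding loads
before issuing the second one.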
drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 8 ++++----
drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm | 2 +-
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 6281b2f9faee..453c08845d74 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -4638,8 +4638,8 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0x01ffffff, 0xb8fbf811,
0xbf0d847b, 0xbfa20078,
0xf4003eb6, 0xf8000000,
- 0xf4003bb6, 0xf8000008,
- 0xbfc70001, 0x8b76ff7a,
+ 0xbfc70000, 0xf4003bb6,
+ 0xf8000008, 0x8b76ff7a,
0x80000000, 0xbfa20027,
0x9376ff7a, 0x00060019,
0x81f9a376, 0xbf0b8179,
@@ -4717,8 +4717,8 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0xb980f821, 0x00000000,
0xbf0d847b, 0xbfa20078,
0xf4003eb6, 0xf8000000,
- 0xf4003bb6, 0xf8000008,
- 0xbfc70001, 0x8b76ff7a,
+ 0xbfc70000, 0xf4003bb6,
+ 0xf8000008, 0x8b76ff7a,
0x80000000, 0xbfa20027,
0x9376ff7a, 0x00060019,
0x81f9a376, 0xbf0b8179,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index 1624a02ad0ef..7ed4b502eb22 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -1357,8 +1357,8 @@ function fixup_vgpr_bank_selection
// ttmp[0:1]: {7b'0} PC[56:0]
// ttmp2, 3, 10, 13, 14, 15: free
s_load_b64 [ttmp14, ttmp15], [ttmp0, ttmp1], 0 scope:SCOPE_CU // Load the 2 instruction DW we are returning to
+ s_wait_kmcnt 0
s_load_b64 [ttmp2, ttmp3], [ttmp0, ttmp1], 8 scope:SCOPE_CU // Load the next 2 instruction DW, just in case
- s_wait_kmcnt 1
s_and_b32 ttmp10, ttmp14, 0x80000000 // Check bit 31 in the first DWORD
// SCC set if ttmp10 is != 0, i.e. if bit 31 == 1
s_cbranch_scc1 L_FIXUP_NOT_VOP12C // If bit 31 is 1, we are not VOP1, VOP2, or VOP3C
--
2.34.1
* [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround
2026-01-16 20:39 [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support Jay Cornwall
2026-01-16 20:39 ` [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source Jay Cornwall
2026-01-16 20:39 ` [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler Jay Cornwall
@ 2026-01-16 20:39 ` Jay Cornwall
2026-01-20 23:27 ` Lancelot SIX
2026-01-16 20:39 ` [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode Jay Cornwall
2026-01-16 20:39 ` [PATCH 5/5] drm/amdkfd: Do not include VGPR MSBs in saved PC during save Jay Cornwall
4 siblings, 1 reply; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC (permalink / raw)
To: amd-gfx
Cc: Jay Cornwall, Gang Ba, Harish Kasiviswanathan, Lancelot Six,
Vladimir Indic
Trap cluster barrier may not serialize with user cluster barrier
under some circumstances. Add a check that waits for a pending
user cluster barrier to complete.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Tested-by: Gang Ba <Gang.Ba@amd.com>
Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Cc: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
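Reviewer note (not part of the commit): the check added to
wait_trap_barriers can be sketched in Python as below. The field offsets and
sizes are the BARRIER_STATE_* values from the .asm hunk in this patch. The
helper names bfe and user_cluster_barrier_in_flight are hypothetical, and
treating VALID as a single bit is an assumption inferred from the
s_bitcmp0_b32 test.

```python
# Field layout taken from the BARRIER_STATE_* variables in the .asm diff.
BARRIER_STATE_VALID_OFFSET = 0
BARRIER_STATE_MEMBER_OFFSET = 4
BARRIER_STATE_MEMBER_SIZE = 7
BARRIER_STATE_SIGNAL_OFFSET = 16
BARRIER_STATE_SIGNAL_SIZE = 7


def bfe(value, offset, size):
    """Extract an unsigned bitfield, like s_bfe_u32 with (offset | size << 16)."""
    return (value >> offset) & ((1 << size) - 1)


def user_cluster_barrier_in_flight(state):
    """Return True if the handler would loop and query the state again."""
    # Assumption: VALID is one bit; when clear the wave is not in a cluster.
    if bfe(state, BARRIER_STATE_VALID_OFFSET, 1) == 0:
        return False
    member = bfe(state, BARRIER_STATE_MEMBER_OFFSET, BARRIER_STATE_MEMBER_SIZE)
    signal = bfe(state, BARRIER_STATE_SIGNAL_OFFSET, BARRIER_STATE_SIGNAL_SIZE)
    # Equal member and signal counts indicate a user cluster barrier in flight.
    return member == signal


def pack(valid, member, signal):
    """Build a state word for the examples below (illustrative only)."""
    return (signal << 16) | (member << 4) | valid
```

For example, pack(1, 3, 3) is reported as in flight, while pack(1, 3, 1) and
pack(0, 3, 3) are not. In the handler the state is re-read with
s_sendmsg_rtn_b32 on every iteration until the counts differ.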
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 31 +++++++++-------
.../amd/amdkfd/cwsr_trap_handler_gfx12.asm | 36 +++++++++++++++----
2 files changed, 47 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 453c08845d74..d86bccc49e3f 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -3754,11 +3754,11 @@ static const uint32_t cwsr_trap_gfx12_hex[] = {
0x84708a70, 0x8070ff70,
0x00000200, 0x7e000280,
0x7e020280, 0x7e040280,
- 0xbefd0080, 0xbe804ec2,
- 0xbf94fffe, 0xb8faf804,
- 0x8b7a847a, 0x91788478,
- 0x8c787a78, 0xd7610002,
+ 0xbefd0080, 0xd7610002,
0x0000fa71, 0x807d817d,
+ 0xbe804ec2, 0xbf94fffe,
+ 0xb8faf804, 0x8b7a847a,
+ 0x91788478, 0x8c787a78,
0xd7610002, 0x0000fa6c,
0x807d817d, 0x917aff6d,
0x80000000, 0xd7610002,
@@ -4587,7 +4587,7 @@ static const uint32_t cwsr_trap_gfx9_5_0_hex[] = {
};
static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
- 0xbfa00001, 0xbfa003aa,
+ 0xbfa00001, 0xbfa003b4,
0xb0804009, 0xb8eef81a,
0xbf880000, 0xb980081a,
0x00000000, 0xb8f8f804,
@@ -4838,15 +4838,20 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0x84708a70, 0x8070ff70,
0x00000200, 0x7e000280,
0x7e020280, 0x7e040280,
- 0xbefd0080, 0xb8faf802,
- 0xbf0c8b7a, 0xbfa20003,
- 0xbe804fc2, 0xbf94fffe,
- 0xbfa10001, 0xbe804ec4,
- 0xbf94fffc, 0xb8faf804,
- 0x8b7aff7a, 0x0001000c,
- 0x9178ff78, 0x0001000c,
- 0x8c787a78, 0xd7610002,
+ 0xbefd0080, 0xd7610002,
0x0000fa71, 0x807d817d,
+ 0xb8faf802, 0xbf0c8b7a,
+ 0xbfa20003, 0xbe804fc2,
+ 0xbf94fffe, 0xbfa10001,
+ 0xbe804ec4, 0xbf94fffc,
+ 0xbefa4c88, 0xbfc70000,
+ 0xbf0c807a, 0xbfa20006,
+ 0x9371ff7a, 0x00070004,
+ 0x937aff7a, 0x00070010,
+ 0xbf06717a, 0xbfa2fff6,
+ 0xb8faf804, 0x8b7aff7a,
+ 0x0001000c, 0x9178ff78,
+ 0x0001000c, 0x8c787a78,
0xd7610002, 0x0000fa6c,
0x807d817d, 0x917aff6d,
0x80000000, 0xd7610002,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index 7ed4b502eb22..ace2a9f2ac73 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -35,6 +35,7 @@
#define HAVE_BANKED_VGPRS (ASIC_FAMILY == CHIP_GC_12_0_3)
#define NUM_NAMED_BARRIERS (ASIC_FAMILY == CHIP_GC_12_0_3 ? 0x10 : 0)
#define HAVE_CLUSTER_BARRIER (ASIC_FAMILY == CHIP_GC_12_0_3)
+#define CLUSTER_BARRIER_SERIALIZE_WORKAROUND (ASIC_FAMILY == CHIP_GC_12_0_3)
#define SINGLE_STEP_MISSED_WORKAROUND 1 //workaround for lost TRAP_AFTER_INST exception when SAVECTX raised
#define HAVE_VALU_SGPR_HAZARD (ASIC_FAMILY == CHIP_GFX12)
@@ -104,6 +105,7 @@ var SQ_WAVE_SCHED_MODE_DEP_MODE_SHIFT = 0
var SQ_WAVE_SCHED_MODE_DEP_MODE_SIZE = 2
var BARRIER_STATE_SIGNAL_OFFSET = 16
+var BARRIER_STATE_SIGNAL_SIZE = 7
var BARRIER_STATE_MEMBER_OFFSET = 4
var BARRIER_STATE_MEMBER_SIZE = 7
var BARRIER_STATE_VALID_OFFSET = 0
@@ -520,9 +522,11 @@ L_SAVE_HWREG:
v_mov_b32 v2, 0x0 //Set of SGPRs for TCP store
s_mov_b32 m0, 0x0 //Next lane of v2 to write to
+ write_hwreg_to_v2(s_save_m0)
+
// Ensure no further changes to barrier or LDS state.
// STATE_PRIV.*BARRIER_COMPLETE may change up to this point.
- wait_trap_barriers(s_save_tmp)
+ wait_trap_barriers(s_save_tmp, s_save_m0, 1)
// Re-read final state of *BARRIER_COMPLETE fields for save.
s_getreg_b32 s_save_tmp, hwreg(HW_REG_WAVE_STATE_PRIV)
@@ -530,7 +534,6 @@ L_SAVE_HWREG:
s_andn2_b32 s_save_state_priv, s_save_state_priv, SQ_WAVE_STATE_PRIV_ALL_BARRIER_COMPLETE_MASK
s_or_b32 s_save_state_priv, s_save_state_priv, s_save_tmp
- write_hwreg_to_v2(s_save_m0)
write_hwreg_to_v2(s_save_pc_lo)
s_andn2_b32 s_save_tmp, s_save_pc_hi, S_SAVE_PC_HI_FIRST_WAVE_MASK
write_hwreg_to_v2(s_save_tmp)
@@ -1198,7 +1201,7 @@ L_SKIP_CLUSTER_BARRIER_RESTORE:
// Make barrier and LDS state visible to all waves in the group/cluster.
// STATE_PRIV.*BARRIER_COMPLETE may change after this point.
- wait_trap_barriers(s_restore_tmp)
+ wait_trap_barriers(s_restore_tmp, 0, 0)
#if HAVE_CLUSTER_BARRIER
// SCC is changed by wait_trap_barriers, restore it separately.
@@ -1211,7 +1214,7 @@ L_SKIP_CLUSTER_BARRIER_RESTORE:
L_END_PGM:
// Make sure that no wave of the group/cluster can exit the trap handler
// before the group/cluster barrier state is saved.
- wait_trap_barriers(s_restore_tmp)
+ wait_trap_barriers(s_restore_tmp, 0, 0)
s_endpgm_saved
end
@@ -1301,11 +1304,11 @@ function restore_xnack_state_priv(s_tmp)
end
#endif
-function wait_trap_barriers(s_tmp)
+function wait_trap_barriers(s_tmp1, s_tmp2, serialize_wa)
#if HAVE_CLUSTER_BARRIER
// If not in a WG then wave cannot use s_barrier_signal_isfirst.
- s_getreg_b32 s_tmp, hwreg(HW_REG_WAVE_STATUS)
- s_bitcmp0_b32 s_tmp, SQ_WAVE_STATUS_IN_WG_SHIFT
+ s_getreg_b32 s_tmp1, hwreg(HW_REG_WAVE_STATUS)
+ s_bitcmp0_b32 s_tmp1, SQ_WAVE_STATUS_IN_WG_SHIFT
s_cbranch_scc1 L_TRAP_CLUSTER_BARRIER_SIGNAL
s_barrier_signal_isfirst -2
@@ -1319,6 +1322,25 @@ L_TRAP_CLUSTER_BARRIER_SIGNAL:
L_SKIP_TRAP_CLUSTER_BARRIER_SIGNAL:
s_barrier_wait -4
+
+#if CLUSTER_BARRIER_SERIALIZE_WORKAROUND
+if serialize_wa
+ // Trap cluster barrier may complete with a user cluster barrier in-flight.
+ // This is indicated if user cluster member count and signal count are equal.
+L_WAIT_USER_CLUSTER_BARRIER_COMPLETE:
+ s_sendmsg_rtn_b32 s_tmp1, sendmsg(MSG_RTN_GET_CLUSTER_BARRIER_STATE)
+ s_wait_kmcnt 0
+ s_bitcmp0_b32 s_tmp1, BARRIER_STATE_VALID_OFFSET
+ s_cbranch_scc1 L_NOT_IN_CLUSTER
+
+ s_bfe_u32 s_tmp2, s_tmp1, (BARRIER_STATE_MEMBER_OFFSET | (BARRIER_STATE_MEMBER_SIZE << 0x10))
+ s_bfe_u32 s_tmp1, s_tmp1, (BARRIER_STATE_SIGNAL_OFFSET | (BARRIER_STATE_SIGNAL_SIZE << 0x10))
+ s_cmp_eq_u32 s_tmp1, s_tmp2
+ s_cbranch_scc1 L_WAIT_USER_CLUSTER_BARRIER_COMPLETE
+end
+L_NOT_IN_CLUSTER:
+#endif
+
#else
s_barrier_signal -2
s_barrier_wait -2
--
2.34.1
* [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
2026-01-16 20:39 [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support Jay Cornwall
` (2 preceding siblings ...)
2026-01-16 20:39 ` [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround Jay Cornwall
@ 2026-01-16 20:39 ` Jay Cornwall
2026-01-20 23:30 ` Lancelot SIX
2026-01-16 20:39 ` [PATCH 5/5] drm/amdkfd: Do not include VGPR MSBs in saved PC during save Jay Cornwall
4 siblings, 1 reply; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC (permalink / raw)
To: amd-gfx; +Cc: Jay Cornwall, Lancelot Six, Vladimir Indic
- Leave DEP_MODE unchanged as it is ignored in the trap handler
- Save/restore SCHED_MODE (gfx12.0 saves in ttmp11)
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Lancelot Six <lancelot.six@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 372 +++++++++---------
.../amd/amdkfd/cwsr_trap_handler_gfx12.asm | 32 +-
2 files changed, 214 insertions(+), 190 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index d86bccc49e3f..9bb7fb6a83ed 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -4587,18 +4587,14 @@ static const uint32_t cwsr_trap_gfx9_5_0_hex[] = {
};
static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
- 0xbfa00001, 0xbfa003b4,
- 0xb0804009, 0xb8eef81a,
- 0xbf880000, 0xb980081a,
- 0x00000000, 0xb8f8f804,
- 0x9177ff77, 0x0c000000,
- 0x846e9a6e, 0x8c776e77,
+ 0xbfa00001, 0xbfa003ac,
+ 0xb0804009, 0xb8f8f804,
0x9178ff78, 0x00008c00,
0xb8fbf811, 0x8b6eff78,
0x00004000, 0xbfa10008,
0x8b6eff7b, 0x00000080,
0xbfa20018, 0x8b6ea07b,
- 0xbfa200d4, 0xbf830010,
+ 0xbfa200d1, 0xbf830010,
0xb8fbf811, 0xbfa0fffb,
0x8b6eff7b, 0x00000bd0,
0xbfa20010, 0xb8eef812,
@@ -4609,7 +4605,7 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0xf0000000, 0xbfa20005,
0x8b6fff6f, 0x00000200,
0xbfa20002, 0x8b6ea07b,
- 0xbfa200be, 0x9177ff77,
+ 0xbfa200bb, 0x9177ff77,
0x007fc000, 0xb8fa04a1,
0x847a967a, 0x8c777a77,
0xb8fa0421, 0x847a957a,
@@ -4702,189 +4698,189 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0xb97a0421, 0x857a8e77,
0xb97a3021, 0x8bfe7e7e,
0x8bea6a6a, 0x85788978,
- 0x936eff77, 0x0002001a,
- 0xb96ef81a, 0xb9783244,
- 0xbe804a6c, 0xb8faf802,
- 0xbf0d987a, 0xbfa10001,
- 0xbfb00000, 0x8b6dff6d,
- 0x01ffffff, 0xbefa0080,
- 0xb97a0151, 0x9177ff77,
- 0x007fc000, 0xb8fa04a1,
- 0x847a967a, 0x8c777a77,
- 0xb8fa0421, 0x847a957a,
- 0x8c777a77, 0xb8fa3021,
- 0x847a8e7a, 0x8c777a77,
- 0xb980f821, 0x00000000,
- 0xbf0d847b, 0xbfa20078,
- 0xf4003eb6, 0xf8000000,
- 0xbfc70000, 0xf4003bb6,
- 0xf8000008, 0x8b76ff7a,
- 0x80000000, 0xbfa20027,
- 0x9376ff7a, 0x00060019,
- 0x81f9a376, 0xbf0b8179,
- 0xbfa20068, 0x81f9ac76,
- 0xbf0b8179, 0xbfa20062,
- 0x81f9b776, 0xbf0b8179,
- 0xbfa2005f, 0x8b76ff7a,
- 0x000001ff, 0xbf06ff76,
- 0x000000fe, 0xbfa2005d,
- 0xbf06ff76, 0x000000ff,
- 0xbfa20057, 0xbf06ff76,
- 0x000000fa, 0xbfa20054,
- 0x81f9ff76, 0x000000e9,
- 0xbf0b8179, 0xbfa20050,
- 0x8b76ff7b, 0xffff0000,
- 0xbf06ff76, 0xbf860000,
- 0xbfa10051, 0x9376ff7b,
- 0x0002000e, 0x8b79ff7b,
- 0x00003f00, 0x85798679,
- 0x8c767976, 0xb9763b01,
- 0xbfa00049, 0x8b76ff7a,
- 0xfc000000, 0xbf06ff76,
- 0xd4000000, 0xbfa20013,
- 0xbf06ff76, 0xc8000000,
- 0xbfa20027, 0x8b76ff7a,
- 0xff000000, 0xbf06ff76,
- 0xcf000000, 0xbfa20039,
- 0x8b79ff7a, 0xffff0000,
- 0xbf06ff79, 0xcc350000,
- 0xbfa20037, 0xbf06ff79,
- 0xcc3a0000, 0xbfa20034,
- 0xbf06ff76, 0xcc000000,
- 0xbfa10031, 0x8b76ff7b,
- 0x000001ff, 0xbf06ff76,
- 0x000000ff, 0xbfa20029,
- 0xbf06ff76, 0x000000fa,
- 0xbfa20026, 0x81f6ff76,
- 0x000000e9, 0xbf0b8176,
- 0xbfa20022, 0x8b76ff7b,
- 0x0003fe00, 0xbf06ff76,
- 0x0001fe00, 0xbfa2001d,
- 0x8b76ff7b, 0x07fc0000,
- 0xbf06ff76, 0x03fc0000,
- 0xbfa20018, 0xbfa00014,
- 0x9376ff7a, 0x00040016,
- 0x81f68176, 0xbf0b8176,
- 0xbfa20012, 0x9376ff7a,
- 0x00050011, 0x81f68176,
- 0xbf0b8176, 0xbfa2000d,
+ 0xb9783244, 0xbe804a6c,
+ 0xb8faf802, 0xbf0d987a,
+ 0xbfa10001, 0xbfb00000,
+ 0x8b6dff6d, 0x01ffffff,
+ 0xbefa0080, 0xb97a0151,
+ 0x9177ff77, 0x007fc000,
+ 0xb8fa04a1, 0x847a967a,
+ 0x8c777a77, 0xb8fa0421,
+ 0x847a957a, 0x8c777a77,
+ 0xb8fa3021, 0x847a8e7a,
+ 0x8c777a77, 0xb980f821,
+ 0x00000000, 0xbf0d847b,
+ 0xbfa20078, 0xf4003eb6,
+ 0xf8000000, 0xbfc70000,
+ 0xf4003bb6, 0xf8000008,
+ 0x8b76ff7a, 0x80000000,
+ 0xbfa20027, 0x9376ff7a,
+ 0x00060019, 0x81f9a376,
+ 0xbf0b8179, 0xbfa20068,
+ 0x81f9ac76, 0xbf0b8179,
+ 0xbfa20062, 0x81f9b776,
+ 0xbf0b8179, 0xbfa2005f,
0x8b76ff7a, 0x000001ff,
+ 0xbf06ff76, 0x000000fe,
+ 0xbfa2005d, 0xbf06ff76,
+ 0x000000ff, 0xbfa20057,
+ 0xbf06ff76, 0x000000fa,
+ 0xbfa20054, 0x81f9ff76,
+ 0x000000e9, 0xbf0b8179,
+ 0xbfa20050, 0x8b76ff7b,
+ 0xffff0000, 0xbf06ff76,
+ 0xbf860000, 0xbfa10051,
+ 0x9376ff7b, 0x0002000e,
+ 0x8b79ff7b, 0x00003f00,
+ 0x85798679, 0x8c767976,
+ 0xb9763b01, 0xbfa00049,
+ 0x8b76ff7a, 0xfc000000,
+ 0xbf06ff76, 0xd4000000,
+ 0xbfa20013, 0xbf06ff76,
+ 0xc8000000, 0xbfa20027,
+ 0x8b76ff7a, 0xff000000,
+ 0xbf06ff76, 0xcf000000,
+ 0xbfa20039, 0x8b79ff7a,
+ 0xffff0000, 0xbf06ff79,
+ 0xcc350000, 0xbfa20037,
+ 0xbf06ff79, 0xcc3a0000,
+ 0xbfa20034, 0xbf06ff76,
+ 0xcc000000, 0xbfa10031,
+ 0x8b76ff7b, 0x000001ff,
0xbf06ff76, 0x000000ff,
- 0xbfa20008, 0x8b76ff7b,
+ 0xbfa20029, 0xbf06ff76,
+ 0x000000fa, 0xbfa20026,
+ 0x81f6ff76, 0x000000e9,
+ 0xbf0b8176, 0xbfa20022,
+ 0x8b76ff7b, 0x0003fe00,
+ 0xbf06ff76, 0x0001fe00,
+ 0xbfa2001d, 0x8b76ff7b,
+ 0x07fc0000, 0xbf06ff76,
+ 0x03fc0000, 0xbfa20018,
+ 0xbfa00014, 0x9376ff7a,
+ 0x00040016, 0x81f68176,
+ 0xbf0b8176, 0xbfa20012,
+ 0x9376ff7a, 0x00050011,
+ 0x81f68176, 0xbf0b8176,
+ 0xbfa2000d, 0x8b76ff7a,
0x000001ff, 0xbf06ff76,
- 0x000000ff, 0xbfa20003,
- 0xbfc70000, 0xbefb006e,
- 0xbfa0ffad, 0xbfc70000,
- 0xbefb006f, 0xbfa0ffaa,
- 0xbfc70000, 0xbeee007e,
- 0xbeef007f, 0xbefe0180,
- 0xbefe4d84, 0xbf8a0000,
- 0x8b7aff7f, 0x04000000,
- 0x847a857a, 0x8c6d7a6d,
- 0xb8eff822, 0xb980f822,
- 0x00000000, 0xb8fa2b01,
- 0x847a997a, 0x8c6d7a6d,
- 0xbefa0080, 0xb97a2b01,
- 0xbefa007e, 0x8b7bff7f,
- 0x01ffffff, 0xbefe00c1,
- 0xbeff00c1, 0xee0a407a,
- 0x000c0000, 0x00000000,
- 0x7e000280, 0xbefe007a,
- 0xbeff007b, 0xb8fb0742,
- 0x847b997b, 0xb8fa3b05,
- 0x807a817a, 0xbf0d997b,
- 0xbfa20002, 0x847a897a,
- 0xbfa00001, 0x847a8a7a,
+ 0x000000ff, 0xbfa20008,
+ 0x8b76ff7b, 0x000001ff,
+ 0xbf06ff76, 0x000000ff,
+ 0xbfa20003, 0xbfc70000,
+ 0xbefb006e, 0xbfa0ffad,
+ 0xbfc70000, 0xbefb006f,
+ 0xbfa0ffaa, 0xbfc70000,
+ 0xbeee007e, 0xbeef007f,
+ 0xbefe0180, 0xbefe4d84,
+ 0xbf8a0000, 0x8b7aff7f,
+ 0x04000000, 0x847a857a,
+ 0x8c6d7a6d, 0xb8eff822,
+ 0xb980f822, 0x00000000,
+ 0xb8fa2b01, 0x847a997a,
+ 0x8c6d7a6d, 0xbefa0080,
+ 0xb97a2b01, 0xbefa007e,
0x8b7bff7f, 0x01ffffff,
- 0x807aff7a, 0x000001c0,
- 0x807a7e7a, 0x827b807b,
- 0xd7610000, 0x00010870,
- 0xd7610000, 0x00010a71,
- 0xd7610000, 0x00010c72,
- 0xd7610000, 0x00010e73,
- 0xd7610000, 0x00011074,
- 0xd7610000, 0x00011275,
- 0xd7610000, 0x00011476,
- 0xd7610000, 0x00011677,
- 0xd7610000, 0x00011a79,
- 0xd7610000, 0x00011c7e,
- 0xd7610000, 0x00011e7f,
- 0xbefe00ff, 0x00003fff,
- 0xbeff0080, 0xee0a407a,
- 0x000c0000, 0x00000000,
- 0xd760007a, 0x00011d00,
- 0xd760007b, 0x00011f00,
+ 0xbefe00c1, 0xbeff00c1,
+ 0xee0a407a, 0x000c0000,
+ 0x00000000, 0x7e000280,
0xbefe007a, 0xbeff007b,
- 0xbef4007e, 0x8b75ff7f,
- 0x01ffffff, 0xbef1007d,
- 0xb8f30742, 0x84739973,
- 0xbefe00c1, 0x857d9973,
- 0x8b7d817d, 0xbf06817d,
- 0xbfa20002, 0xbeff0080,
- 0xbfa00002, 0xbeff00c1,
- 0xbfa0000a, 0xee0a4074,
- 0x008c0000, 0x00008000,
- 0xee0a4074, 0x010c0000,
+ 0xb8fb0742, 0x847b997b,
+ 0xb8fa3b05, 0x807a817a,
+ 0xbf0d997b, 0xbfa20002,
+ 0x847a897a, 0xbfa00001,
+ 0x847a8a7a, 0x8b7bff7f,
+ 0x01ffffff, 0x807aff7a,
+ 0x000001c0, 0x807a7e7a,
+ 0x827b807b, 0xd7610000,
+ 0x00010870, 0xd7610000,
+ 0x00010a71, 0xd7610000,
+ 0x00010c72, 0xd7610000,
+ 0x00010e73, 0xd7610000,
+ 0x00011074, 0xd7610000,
+ 0x00011275, 0xd7610000,
+ 0x00011476, 0xd7610000,
+ 0x00011677, 0xd7610000,
+ 0x00011a79, 0xd7610000,
+ 0x00011c7e, 0xd7610000,
+ 0x00011e7f, 0xbefe00ff,
+ 0x00003fff, 0xbeff0080,
+ 0xee0a407a, 0x000c0000,
+ 0x00000000, 0xd760007a,
+ 0x00011d00, 0xd760007b,
+ 0x00011f00, 0xbefe007a,
+ 0xbeff007b, 0xbef4007e,
+ 0x8b75ff7f, 0x01ffffff,
+ 0xbef1007d, 0xb8f30742,
+ 0x84739973, 0xbefe00c1,
+ 0x857d9973, 0x8b7d817d,
+ 0xbf06817d, 0xbfa20002,
+ 0xbeff0080, 0xbfa00002,
+ 0xbeff00c1, 0xbfa0000a,
+ 0xee0a4074, 0x008c0000,
+ 0x00008000, 0xee0a4074,
+ 0x010c0000, 0x00010000,
+ 0xee0a4074, 0x018c0000,
+ 0x00018000, 0xbfa00009,
+ 0xee0a4074, 0x008c0000,
0x00010000, 0xee0a4074,
- 0x018c0000, 0x00018000,
- 0xbfa00009, 0xee0a4074,
- 0x008c0000, 0x00010000,
- 0xee0a4074, 0x010c0000,
- 0x00020000, 0xee0a4074,
- 0x018c0000, 0x00030000,
- 0xb8f03b05, 0x80708170,
- 0xbf0d9973, 0xbfa20002,
- 0x84708970, 0xbfa00001,
- 0x84708a70, 0x8070ff70,
- 0x00000200, 0x7e000280,
- 0x7e020280, 0x7e040280,
- 0xbefd0080, 0xd7610002,
- 0x0000fa71, 0x807d817d,
- 0xb8faf802, 0xbf0c8b7a,
- 0xbfa20003, 0xbe804fc2,
- 0xbf94fffe, 0xbfa10001,
- 0xbe804ec4, 0xbf94fffc,
- 0xbefa4c88, 0xbfc70000,
- 0xbf0c807a, 0xbfa20006,
- 0x9371ff7a, 0x00070004,
- 0x937aff7a, 0x00070010,
- 0xbf06717a, 0xbfa2fff6,
- 0xb8faf804, 0x8b7aff7a,
- 0x0001000c, 0x9178ff78,
- 0x0001000c, 0x8c787a78,
- 0xd7610002, 0x0000fa6c,
- 0x807d817d, 0x917aff6d,
- 0x80000000, 0xd7610002,
+ 0x010c0000, 0x00020000,
+ 0xee0a4074, 0x018c0000,
+ 0x00030000, 0xb8f03b05,
+ 0x80708170, 0xbf0d9973,
+ 0xbfa20002, 0x84708970,
+ 0xbfa00001, 0x84708a70,
+ 0x8070ff70, 0x00000200,
+ 0x7e000280, 0x7e020280,
+ 0x7e040280, 0xbefd0080,
+ 0xd7610002, 0x0000fa71,
+ 0x807d817d, 0xb8faf802,
+ 0xbf0c8b7a, 0xbfa20003,
+ 0xbe804fc2, 0xbf94fffe,
+ 0xbfa10001, 0xbe804ec4,
+ 0xbf94fffc, 0xbefa4c88,
+ 0xbfc70000, 0xbf0c807a,
+ 0xbfa20006, 0x9371ff7a,
+ 0x00070004, 0x937aff7a,
+ 0x00070010, 0xbf06717a,
+ 0xbfa2fff6, 0xb8faf804,
+ 0x8b7aff7a, 0x0001000c,
+ 0x9178ff78, 0x0001000c,
+ 0x8c787a78, 0xd7610002,
+ 0x0000fa6c, 0x807d817d,
+ 0x917aff6d, 0x80000000,
+ 0xd7610002, 0x0000fa7a,
+ 0x807d817d, 0xd7610002,
+ 0x0000fa6e, 0x807d817d,
+ 0xbefa0080, 0xd7610002,
0x0000fa7a, 0x807d817d,
- 0xd7610002, 0x0000fa6e,
- 0x807d817d, 0xbefa0080,
+ 0xd7610002, 0x0000fa78,
+ 0x807d817d, 0xb8faf811,
0xd7610002, 0x0000fa7a,
0x807d817d, 0xd7610002,
- 0x0000fa78, 0x807d817d,
- 0xb8faf811, 0xd7610002,
+ 0x0000fa6f, 0x807d817d,
+ 0xb8f1f801, 0x937aff6d,
+ 0x00060019, 0x847a8c7a,
+ 0x8c717a71, 0xd7610002,
+ 0x0000fa71, 0x807d817d,
+ 0xb8f1f814, 0xd7610002,
+ 0x0000fa71, 0x807d817d,
+ 0xb8f1f815, 0xd7610002,
+ 0x0000fa71, 0x807d817d,
+ 0xb8f1f812, 0xd7610002,
+ 0x0000fa71, 0x807d817d,
+ 0xb8f1f813, 0xd7610002,
+ 0x0000fa71, 0x807d817d,
+ 0xb8faf802, 0xd7610002,
0x0000fa7a, 0x807d817d,
- 0xd7610002, 0x0000fa6f,
- 0x807d817d, 0xb8f1f801,
- 0x937aff6d, 0x00060019,
- 0x847a8c7a, 0x8c717a71,
- 0xd7610002, 0x0000fa71,
- 0x807d817d, 0xb8f1f814,
- 0xd7610002, 0x0000fa71,
- 0x807d817d, 0xb8f1f815,
- 0xd7610002, 0x0000fa71,
- 0x807d817d, 0xb8f1f812,
- 0xd7610002, 0x0000fa71,
- 0x807d817d, 0xb8f1f813,
- 0xd7610002, 0x0000fa71,
- 0x807d817d, 0xb8faf802,
+ 0xbefa50c1, 0xbfc70000,
0xd7610002, 0x0000fa7a,
- 0x807d817d, 0xbefa50c1,
+ 0x807d817d, 0xbefa4c88,
0xbfc70000, 0xd7610002,
0x0000fa7a, 0x807d817d,
- 0xbefa4c88, 0xbfc70000,
- 0xd7610002, 0x0000fa7a,
- 0x807d817d, 0xbefe00ff,
- 0x0000ffff, 0xbeff0080,
+ 0xb8faf81a, 0xd7610002,
+ 0x0000fa7a, 0x807d817d,
+ 0xbefe00c1, 0xbeff0080,
0x80767074, 0x82778075,
0xee0a4076, 0x010c0000,
0x00000000, 0xbefe00c1,
@@ -5061,7 +5057,7 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0x018c0000, 0x00030000,
0x807d847d, 0x8070ff70,
0x00000400, 0xbf0a7b7d,
- 0xbfa2ffe9, 0xbfa00183,
+ 0xbfa2ffe9, 0xbfa00184,
0xbef4007e, 0x8b75ff7f,
0x01ffffff, 0xbef1007f,
0xb8f20742, 0x84729972,
@@ -5229,6 +5225,8 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0x856e906e, 0x8b6e6e6e,
0xbfa10003, 0xbe804ec3,
0x816ec16e, 0xbfa0fffb,
+ 0xf4601bbb, 0xf8000040,
+ 0xbfc70000, 0xb96ef81a,
0xbefd006f, 0xbefe0070,
0xbeff0071, 0xb979f822,
0xb97b2011, 0x857b867b,
@@ -5248,19 +5246,17 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0x856e8e77, 0xb96e3021,
0x8b6dff6d, 0x01ffffff,
0x8bfe7e7e, 0x8bea6a6a,
- 0x936eff77, 0x0002001a,
- 0xb96ef81a, 0xb97af804,
+ 0xb97af804, 0xb8eef802,
+ 0xbf0c8b6e, 0xbfa20003,
+ 0xbe804fc2, 0xbf94fffe,
+ 0xbfa10001, 0xbe804ec4,
+ 0xbf94fffc, 0x857a897a,
+ 0xb97a0244, 0xbe804a6c,
0xb8eef802, 0xbf0c8b6e,
0xbfa20003, 0xbe804fc2,
0xbf94fffe, 0xbfa10001,
0xbe804ec4, 0xbf94fffc,
- 0x857a897a, 0xb97a0244,
- 0xbe804a6c, 0xb8eef802,
- 0xbf0c8b6e, 0xbfa20003,
- 0xbe804fc2, 0xbf94fffe,
- 0xbfa10001, 0xbe804ec4,
- 0xbf94fffc, 0xbfb10000,
+ 0xbfb10000, 0xbf9f0000,
0xbf9f0000, 0xbf9f0000,
0xbf9f0000, 0xbf9f0000,
- 0xbf9f0000, 0x00000000,
};
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index ace2a9f2ac73..ccc61f60ceb3 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -36,6 +36,7 @@
#define NUM_NAMED_BARRIERS (ASIC_FAMILY == CHIP_GC_12_0_3 ? 0x10 : 0)
#define HAVE_CLUSTER_BARRIER (ASIC_FAMILY == CHIP_GC_12_0_3)
#define CLUSTER_BARRIER_SERIALIZE_WORKAROUND (ASIC_FAMILY == CHIP_GC_12_0_3)
+#define RELAXED_SCHEDULING_IN_TRAP (ASIC_FAMILY == CHIP_GFX12)
#define SINGLE_STEP_MISSED_WORKAROUND 1 //workaround for lost TRAP_AFTER_INST exception when SAVECTX raised
#define HAVE_VALU_SGPR_HAZARD (ASIC_FAMILY == CHIP_GFX12)
@@ -110,9 +111,11 @@ var BARRIER_STATE_MEMBER_OFFSET = 4
var BARRIER_STATE_MEMBER_SIZE = 7
var BARRIER_STATE_VALID_OFFSET = 0
+#if RELAXED_SCHEDULING_IN_TRAP
var TTMP11_SCHED_MODE_SHIFT = 26
var TTMP11_SCHED_MODE_SIZE = 2
var TTMP11_SCHED_MODE_MASK = 0xC000000
+#endif
var NAMED_BARRIERS_SR_OFFSET_FROM_HWREG = 0x80
var S_BARRIER_INIT_MEMBERCNT_MASK = 0x7F0000
@@ -223,18 +226,22 @@ L_JUMP_TO_RESTORE:
s_branch L_RESTORE
L_SKIP_RESTORE:
+#if RELAXED_SCHEDULING_IN_TRAP
// Assume most relaxed scheduling mode is set. Save and revert to normal mode.
s_getreg_b32 ttmp2, hwreg(HW_REG_WAVE_SCHED_MODE)
s_wait_alu 0
s_setreg_imm32_b32 hwreg(HW_REG_WAVE_SCHED_MODE, \
SQ_WAVE_SCHED_MODE_DEP_MODE_SHIFT, SQ_WAVE_SCHED_MODE_DEP_MODE_SIZE), 0
+#endif
s_getreg_b32 s_save_state_priv, hwreg(HW_REG_WAVE_STATE_PRIV) //save STATUS since we will change SCC
+#if RELAXED_SCHEDULING_IN_TRAP
// Save SCHED_MODE[1:0] into ttmp11[27:26].
s_andn2_b32 ttmp11, ttmp11, TTMP11_SCHED_MODE_MASK
s_lshl_b32 ttmp2, ttmp2, TTMP11_SCHED_MODE_SHIFT
s_or_b32 ttmp11, ttmp11, ttmp2
+#endif
// Clear SPI_PRIO: do not save with elevated priority.
// Clear ECC_ERR: prevents SQC store and triggers FATAL_HALT if setreg'd.
@@ -316,7 +323,7 @@ L_FETCH_2ND_TRAP:
s_cbranch_scc0 L_NO_SIGN_EXTEND_TMA
s_or_b32 ttmp15, ttmp15, ~ADDRESS_HI32_MASK
L_NO_SIGN_EXTEND_TMA:
-#if ASIC_FAMILY == CHIP_GFX12
+#if RELAXED_SCHEDULING_IN_TRAP
// Move SCHED_MODE[1:0] from ttmp11 to unused bits in ttmp1[27:26] (return PC_HI).
// The second-level trap will restore from ttmp1 for backwards compatibility.
s_and_b32 ttmp2, ttmp11, TTMP11_SCHED_MODE_MASK
@@ -382,8 +389,10 @@ L_EXIT_TRAP:
// Only restore fields which the trap handler changes.
s_lshr_b32 s_save_state_priv, s_save_state_priv, SQ_WAVE_STATE_PRIV_SCC_SHIFT
+#if RELAXED_SCHEDULING_IN_TRAP
// Assume relaxed scheduling mode after this point.
restore_sched_mode(ttmp2)
+#endif
s_setreg_b32 hwreg(HW_REG_WAVE_STATE_PRIV, SQ_WAVE_STATE_PRIV_SCC_SHIFT, \
SQ_WAVE_STATE_PRIV_POISON_ERR_SHIFT - SQ_WAVE_STATE_PRIV_SCC_SHIFT + 1), s_save_state_priv
@@ -591,8 +600,18 @@ L_SAVE_HWREG:
write_hwreg_to_v2(s_save_tmp)
#endif
+#if ASIC_FAMILY >= CHIP_GC_12_0_3
+ s_getreg_b32 s_save_tmp, hwreg(HW_REG_WAVE_SCHED_MODE)
+ write_hwreg_to_v2(s_save_tmp)
+#endif
+
+#if ! SAVE_TTMPS_IN_SGPR_BLOCK
// Write HWREGs with 16 VGPR lanes. TTMPs occupy space after this.
s_mov_b32 exec_lo, 0xFFFF
+#else
+ // All 128 bytes are available for HWREGs.
+ s_mov_b32 exec_lo, 0xFFFFFFFF
+#endif
s_mov_b32 exec_hi, 0x0
s_add_u32 s_save_addr_lo, s_save_base_addr_lo, s_save_mem_offset
s_addc_u32 s_save_addr_hi, s_save_base_addr_hi, 0x0
@@ -1155,6 +1174,12 @@ L_SKIP_TRAP_CLUSTER_BARRIER_SIGNAL:
L_SKIP_CLUSTER_BARRIER_RESTORE:
#endif
+#if ASIC_FAMILY >= CHIP_GC_12_0_3
+ s_load_b32 s_restore_tmp, [s_restore_addr_lo, s_restore_addr_hi], null scope:SCOPE_SYS offset:0x40
+ s_wait_kmcnt 0
+ s_setreg_b32 hwreg(HW_REG_WAVE_SCHED_MODE), s_restore_tmp
+#endif
+
s_mov_b32 m0, s_restore_m0
s_mov_b32 exec_lo, s_restore_exec_lo
s_mov_b32 exec_hi, s_restore_exec_hi
@@ -1194,8 +1219,10 @@ L_SKIP_CLUSTER_BARRIER_RESTORE:
s_and_b64 exec, exec, exec // Restore STATUS.EXECZ, not writable by s_setreg_b32
s_and_b64 vcc, vcc, vcc // Restore STATUS.VCCZ, not writable by s_setreg_b32
+#if RELAXED_SCHEDULING_IN_TRAP
// Assume relaxed scheduling mode after this point.
restore_sched_mode(s_restore_tmp)
+#endif
s_setreg_b32 hwreg(HW_REG_WAVE_STATE_PRIV), s_restore_state_priv // SCC is included, which is changed by previous salu
@@ -1347,11 +1374,12 @@ L_NOT_IN_CLUSTER:
#endif
end
-
+#if RELAXED_SCHEDULING_IN_TRAP
function restore_sched_mode(s_tmp)
s_bfe_u32 s_tmp, ttmp11, (TTMP11_SCHED_MODE_SHIFT | (TTMP11_SCHED_MODE_SIZE << 0x10))
s_setreg_b32 hwreg(HW_REG_WAVE_SCHED_MODE), s_tmp
end
+#endif
function restore_barrier_signal_count(barrier_id)
// extract the saved signal count from s_restore_tmp
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/5] drm/amdkfd: Do not include VGPR MSBs in saved PC during save
2026-01-16 20:39 [PATCH 0/5] drm/amdkfd: Trap handler fixes and gfx12.1 support Jay Cornwall
` (3 preceding siblings ...)
2026-01-16 20:39 ` [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode Jay Cornwall
@ 2026-01-16 20:39 ` Jay Cornwall
4 siblings, 0 replies; 14+ messages in thread
From: Jay Cornwall @ 2026-01-16 20:39 UTC (permalink / raw)
To: amd-gfx; +Cc: Lancelot Six, Alexey Kondratiev, Jay Cornwall, Vladimir Indic
From: Lancelot Six <lancelot.six@amd.com>
The current trap handler uses the top bits of ttmp1 to store a copy of
sq_wave_mode.*vgpr_msb (except for src2_vgpr_msb). This is so the
effective values in sq_wave_mode can be cleared to ensure correct
behavior of the trap handler.
When saving sq_wave_mode, the trap handler correctly rebuilds the
expected value (with *vgpr_msb restored), so the save area is correct.
However, the PC itself is copied from ttmp[0:1], which contains the
wave's PC as well as the saved MSBs.
The debugger reads the PC from the save area and is confused when non-0
values from VGPR_MSBs are present.
This patch fixes this by saving the PC in the save area's PC slot, not
the composite of the PC and VGPR_MSBs. On restore, the VGPR_MSBs are
restored from sq_wave_mode.
Signed-off-by: Lancelot Six <lancelot.six@amd.com>
Tested-by: Alexey Kondratiev <Alexey.Kondratiev@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Vladimir Indic <vladimir.indic@amd.com>
---
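Note for archive readers (not part of the commit): a small C sketch of the masking change, assuming the saved PC_HI word is modeled as a uint32_t. The two mask values are the literals that appear in the removed and added gfx12.1 hex words in this patch (0x80000000 and 0x01ffffff); the gfx12 array uses 0x0000ffff instead. The macro name ADDRESS_HI32_MASK_GFX12_1 and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Literal from the removed instruction (s_andn2_b32 with the first-wave mask). */
#define S_SAVE_PC_HI_FIRST_WAVE_MASK 0x80000000u
/* Literal from the added gfx12.1 instruction (s_and_b32 with ADDRESS_HI32_MASK).
 * The _GFX12_1 suffix is illustrative only. */
#define ADDRESS_HI32_MASK_GFX12_1    0x01FFFFFFu

/* Before: only the first-wave bit is cleared, so any VGPR MSB copies
 * parked above the address bits leak into the saved PC. */
static inline uint32_t saved_pc_hi_old(uint32_t pc_hi)
{
    return pc_hi & ~S_SAVE_PC_HI_FIRST_WAVE_MASK;
}

/* After: only the address bits are kept, so the save area holds a clean PC. */
static inline uint32_t saved_pc_hi_new(uint32_t pc_hi)
{
    return pc_hi & ADDRESS_HI32_MASK_GFX12_1;
}
```

For a composite value such as 0xDA001234, the old masking leaves 0x5A001234 in the PC slot, while the new masking leaves 0x00001234, which is what a debugger reading the save area expects.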
drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 6 +++---
drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm | 2 +-
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 9bb7fb6a83ed..39bdc98b8b6d 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -3760,8 +3760,8 @@ static const uint32_t cwsr_trap_gfx12_hex[] = {
0xb8faf804, 0x8b7a847a,
0x91788478, 0x8c787a78,
0xd7610002, 0x0000fa6c,
- 0x807d817d, 0x917aff6d,
- 0x80000000, 0xd7610002,
+ 0x807d817d, 0x8b7aff6d,
+ 0x0000ffff, 0xd7610002,
0x0000fa7a, 0x807d817d,
0xd7610002, 0x0000fa6e,
0x807d817d, 0xd7610002,
@@ -4848,7 +4848,7 @@ static const uint32_t cwsr_trap_gfx12_1_0_hex[] = {
0x9178ff78, 0x0001000c,
0x8c787a78, 0xd7610002,
0x0000fa6c, 0x807d817d,
- 0x917aff6d, 0x80000000,
+ 0x8b7aff6d, 0x01ffffff,
0xd7610002, 0x0000fa7a,
0x807d817d, 0xd7610002,
0x0000fa6e, 0x807d817d,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
index ccc61f60ceb3..c33e7660d8f4 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx12.asm
@@ -544,7 +544,7 @@ L_SAVE_HWREG:
s_or_b32 s_save_state_priv, s_save_state_priv, s_save_tmp
write_hwreg_to_v2(s_save_pc_lo)
- s_andn2_b32 s_save_tmp, s_save_pc_hi, S_SAVE_PC_HI_FIRST_WAVE_MASK
+ s_and_b32 s_save_tmp, s_save_pc_hi, ADDRESS_HI32_MASK
write_hwreg_to_v2(s_save_tmp)
write_hwreg_to_v2(s_save_exec_lo)
#if WAVE32_ONLY
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source
2026-01-16 20:39 ` [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source Jay Cornwall
@ 2026-01-20 22:34 ` Lancelot SIX
2026-01-21 10:27 ` Indic, Vladimir
0 siblings, 1 reply; 14+ messages in thread
From: Lancelot SIX @ 2026-01-20 22:34 UTC (permalink / raw)
To: Jay Cornwall, amd-gfx; +Cc: Vladimir Indic
Hi,
This looks good to me, thanks.
On 16/01/2026 20:39, Jay Cornwall wrote:
> Binary and source desynced during branch activity. Source merge
> also introduced compile error.
>
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
2026-01-16 20:39 ` [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler Jay Cornwall
@ 2026-01-20 22:38 ` Lancelot SIX
2026-01-21 10:32 ` Indic, Vladimir
0 siblings, 1 reply; 14+ messages in thread
From: Lancelot SIX @ 2026-01-20 22:38 UTC (permalink / raw)
To: Jay Cornwall, amd-gfx; +Cc: Joseph Greathouse, Vladimir Indic
Hi,
This looks good, thanks for fixing this.
Thanks,
Lancelot.
On 16/01/2026 20:39, Jay Cornwall wrote:
> Scalar loads may arrive out-of-order with respect to KMCNT.
> The affected code expects the two loads to arrive in-order.
>
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Joseph Greathouse <joseph.greathouse@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround
2026-01-16 20:39 ` [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround Jay Cornwall
@ 2026-01-20 23:27 ` Lancelot SIX
2026-01-21 10:37 ` Indic, Vladimir
0 siblings, 1 reply; 14+ messages in thread
From: Lancelot SIX @ 2026-01-20 23:27 UTC (permalink / raw)
To: Jay Cornwall, amd-gfx; +Cc: Gang Ba, Harish Kasiviswanathan, Vladimir Indic
Hi,
On 16/01/2026 20:39, Jay Cornwall wrote:
> Trap cluster barrier may not serialize with user cluster barrier
> under some circumstances. Add a check for pending user cluster
> barrier complete.
>
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Tested-by: Gang Ba <Gang.Ba@amd.com>
> Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
To the best of my understanding, this looks OK. Thanks.
Best,
Lancelot.
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
2026-01-16 20:39 ` [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode Jay Cornwall
@ 2026-01-20 23:30 ` Lancelot SIX
2026-01-21 10:46 ` Indic, Vladimir
0 siblings, 1 reply; 14+ messages in thread
From: Lancelot SIX @ 2026-01-20 23:30 UTC (permalink / raw)
To: Jay Cornwall, amd-gfx; +Cc: Vladimir Indic
Hi,
Thanks, that looks good to me.
Best,
Lancelot.
On 16/01/2026 20:39, Jay Cornwall wrote:
> - Leave DEP_MODE unchanged as it is ignored in the trap handler
> - Save/restore SCHED_MODE (gfx12.0 saves in ttmp11)
>
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source
2026-01-20 22:34 ` Lancelot SIX
@ 2026-01-21 10:27 ` Indic, Vladimir
0 siblings, 0 replies; 14+ messages in thread
From: Indic, Vladimir @ 2026-01-21 10:27 UTC (permalink / raw)
To: Six, Lancelot, Cornwall, Jay, amd-gfx@lists.freedesktop.org
[AMD Official Use Only - AMD Internal Distribution Only]
Adding one more review, the patch LGTM, thanks!
Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>
-----Original Message-----
From: Six, Lancelot <Lancelot.Six@amd.com>
Sent: Tuesday, January 20, 2026 11:35 PM
To: Cornwall, Jay <Jay.Cornwall@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Indic, Vladimir <Vladimir.Indic@amd.com>
Subject: Re: [PATCH 1/5] drm/amdkfd: Sync trap handler binary with source
Hi,
This looks good to me, thanks.
On 16/01/2026 20:39, Jay Cornwall wrote:
> Binary and source desynced during branch activity. Source merge also
> introduced compile error.
>
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
2026-01-20 22:38 ` Lancelot SIX
@ 2026-01-21 10:32 ` Indic, Vladimir
0 siblings, 0 replies; 14+ messages in thread
From: Indic, Vladimir @ 2026-01-21 10:32 UTC (permalink / raw)
To: Six, Lancelot, Cornwall, Jay, amd-gfx@lists.freedesktop.org
Cc: Greathouse, Joseph
[AMD Official Use Only - AMD Internal Distribution Only]
Adding one more review. LGTM!
Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>
-----Original Message-----
From: Six, Lancelot <Lancelot.Six@amd.com>
Sent: Tuesday, January 20, 2026 11:38 PM
To: Cornwall, Jay <Jay.Cornwall@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Greathouse, Joseph <Joseph.Greathouse@amd.com>; Indic, Vladimir <Vladimir.Indic@amd.com>
Subject: Re: [PATCH 2/5] drm/amdkfd: Fix scalar load ordering in gfx12.1 trap handler
Hi,
This looks good, thanks for fixing this.
Thanks,
Lancelot.
On 16/01/2026 20:39, Jay Cornwall wrote:
> Scalar loads may arrive out-of-order with respect to KMCNT.
> The affected code expects the two loads to arrive in-order.
>
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Joseph Greathouse <joseph.greathouse@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround
2026-01-20 23:27 ` Lancelot SIX
@ 2026-01-21 10:37 ` Indic, Vladimir
0 siblings, 0 replies; 14+ messages in thread
From: Indic, Vladimir @ 2026-01-21 10:37 UTC (permalink / raw)
To: Six, Lancelot, Cornwall, Jay, amd-gfx@lists.freedesktop.org
Cc: Ba, Gang, Kasiviswanathan, Harish
[AMD Official Use Only - AMD Internal Distribution Only]
One more
Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>
-----Original Message-----
From: Six, Lancelot <Lancelot.Six@amd.com>
Sent: Wednesday, January 21, 2026 12:27 AM
To: Cornwall, Jay <Jay.Cornwall@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Ba, Gang <Gang.Ba@amd.com>; Kasiviswanathan, Harish <Harish.Kasiviswanathan@amd.com>; Indic, Vladimir <Vladimir.Indic@amd.com>
Subject: Re: [PATCH 3/5] drm/amdkfd: gfx12.1 cluster barrier context save workaround
Hi,
On 16/01/2026 20:39, Jay Cornwall wrote:
> Trap cluster barrier may not serialize with user cluster barrier under
> some circumstances. Add a check for pending user cluster barrier
> complete.
>
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Tested-by: Gang Ba <Gang.Ba@amd.com>
> Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
To the best of my understanding, this looks OK. Thanks.
Best,
Lancelot.
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
2026-01-20 23:30 ` Lancelot SIX
@ 2026-01-21 10:46 ` Indic, Vladimir
0 siblings, 0 replies; 14+ messages in thread
From: Indic, Vladimir @ 2026-01-21 10:46 UTC (permalink / raw)
To: Six, Lancelot, Cornwall, Jay, amd-gfx@lists.freedesktop.org
[AMD Official Use Only - AMD Internal Distribution Only]
One more review, LGTM! Thanks!
Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>
-----Original Message-----
From: Six, Lancelot <Lancelot.Six@amd.com>
Sent: Wednesday, January 21, 2026 12:30 AM
To: Cornwall, Jay <Jay.Cornwall@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Indic, Vladimir <Vladimir.Indic@amd.com>
Subject: Re: [PATCH 4/5] drm/amdkfd: gfx12.1 trap handler support for expert scheduling mode
Hi,
Thanks, that looks good to me.
Best,
Lancelot.
On 16/01/2026 20:39, Jay Cornwall wrote:
> - Leave DEP_MODE unchanged as it is ignored in the trap handler
> - Save/restore SCHED_MODE (gfx12.0 saves in ttmp11)
>
> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
> Cc: Lancelot Six <lancelot.six@amd.com>
> Cc: Vladimir Indic <vladimir.indic@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
^ permalink raw reply [flat|nested] 14+ messages in thread