AMD-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: James Zhu <jamesz@amd.com>
To: "Kim, Jonathan" <Jonathan.Kim@amd.com>,
	"Zhu, James" <James.Zhu@amd.com>,
	 "amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: "Six, Lancelot" <Lancelot.Six@amd.com>
Subject: Re: [PATCH] drm/amdkfd: add mask for debug trap flag setting
Date: Mon, 19 Jan 2026 15:57:39 -0500	[thread overview]
Message-ID: <4012256e-1553-4737-95dc-732be663ff88@amd.com> (raw)
In-Reply-To: <CY8PR12MB74355E60D1AF872BD2E6E9DA8588A@CY8PR12MB7435.namprd12.prod.outlook.com>

[-- Attachment #1: Type: text/plain, Size: 10678 bytes --]

Hi Jon,

thanks!

Answers are in line.

Best Regards!

James Zhu

On 2026-01-19 14:28, Kim, Jonathan wrote:
> [Public]
>
>> -----Original Message-----
>> From: Zhu, James<James.Zhu@amd.com>
>> Sent: Monday, January 19, 2026 12:15 PM
>> To:amd-gfx@lists.freedesktop.org
>> Cc: Six, Lancelot<Lancelot.Six@amd.com>; Kim, Jonathan
>> <Jonathan.Kim@amd.com>; Zhu, James<James.Zhu@amd.com>
>> Subject: [PATCH] drm/amdkfd: add mask for debug trap flag setting
>>
>> to specify which bits are valid setting on flags.
> Can you elaborate?
> Without mask setting, Need read current flags 1st, any update flags'

specified bits, others bits keep unchanged, then send absolute flags 
value to  system.

mask can indicate which bits in flags are need to been touched, we can 
put value(0 or 1) into

those specified bits, and send to system directly without read back 
operation.
For example. flags = 0x2 mask =0x3, will keep MSB31-MSB2 unchanged, LSB1 
set to 1,LSB0 set 0.

Will update  messages to:

*/apply masked bit update on debug trap flags/*

*/The implementation uses an XOR-AND-XOR pattern to atomically update 
bits within the 'set_flags->mask' range./*

*/Logic: /*

*/- Difference: (target_flags->flags ^ set_flags->flags) identifies bit 
mismatches. /*

*/- Filtering: The result is masked by 'set_flags->mask' to restrict 
changes. /*

*/- Application: XORing back to 'target_flags->flags' flips only the 
mismatched bits to the desired state./*

*/This approach avoids multiple assignment steps and preserves all bits 
outside of the mask./*

>
>> Signed-off-by: James Zhu<James.Zhu@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  2 +-
>>   drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 48 ++++++++++++------------
>>   drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |  3 +-
>>   include/uapi/linux/kfd_ioctl.h           |  3 +-
>>   4 files changed, 30 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> index 79586abad7fd..fd43ef7c9076 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> @@ -3377,7 +3377,7 @@ static int kfd_ioctl_set_debug_trap(struct file *filep,
>> struct kfd_process *p, v
>>                                args->clear_node_address_watch.id);
>>                break;
>>        case KFD_IOC_DBG_TRAP_SET_FLAGS:
>> -             r = kfd_dbg_trap_set_flags(target, &args->set_flags.flags);
>> +             r = kfd_dbg_trap_set_flags(target, &args->set_flags);
>>                break;
>>        case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT:
>>                r = kfd_dbg_ev_query_debug_event(target,
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>> index a04875236647..279160ca71a9 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>> @@ -512,38 +512,42 @@ static void
>> kfd_dbg_clear_process_address_watch(struct kfd_process *target)
>>                        kfd_dbg_trap_clear_dev_address_watch(target-
>>> pdds[i], j);
>>   }
>>
>> -int kfd_dbg_trap_set_flags(struct kfd_process *target, uint32_t *flags)
>> +int kfd_dbg_trap_set_flags(struct kfd_process *target,
>> +     struct kfd_ioctl_dbg_trap_set_flags_args *set_flags)
>>   {
>>        uint32_t prev_flags = target->dbg_flags;
>>        int i, r = 0, rewind_count = 0;
>>
>> +     if (!((set_flags->flags ^ prev_flags) & set_flags->mask))
> Does this block old debuggers from setting debug mode if pad is 0?
> [JZ] Yes, this logical can been moved down after target->dbg_flags gets updated
> if (target->dbg_flags == prev_flags) return 0;
>> +             return 0;
>> +
>>        for (i = 0; i < target->n_pdds; i++) {
>>                struct kfd_topology_device *topo_dev =
>>                                kfd_topology_device_by_id(target->pdds[i]-
>>> dev->id);
>>                uint32_t caps = topo_dev->node_props.capability;
>>                uint32_t caps2 = topo_dev->node_props.capability2;
>> +             struct amdgpu_device *adev = target->pdds[i]->dev->adev;
>>
>> -             if (!(caps &
>> HSA_CAP_TRAP_DEBUG_PRECISE_MEMORY_OPERATIONS_SUPPORTED) &&
>> -                     (*flags & KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP)) {
>> -                     *flags = prev_flags;
>> -                     return -EACCES;
>> -             }
>> -
>> -             if (!(caps &
>> HSA_CAP_TRAP_DEBUG_PRECISE_ALU_OPERATIONS_SUPPORTED) &&
>> -                 (*flags & KFD_DBG_TRAP_FLAG_SINGLE_ALU_OP)) {
>> -                     *flags = prev_flags;
>> -                     return -EACCES;
>> -             }
>> -
>> -             if (!(caps2 &
>> HSA_CAP2_TRAP_DEBUG_LDS_OUT_OF_ADDR_RANGE_SUPPORTED) &&
>> -                 (*flags &
>> KFD_DBG_TRAP_FLAG_LDS_OUT_OF_ADDR_RANGE)) {
>> -                     *flags = prev_flags;
>> +             if (set_flags->mask == 0xFFFFFFFF && !set_flags->flags)
>> +                     break;
> Seems like this is only for a deactivation system call.
> Probably don't want to let users abuse this with a magic number then.
> Maybe breakout into new static call in deactivation or just device loop deactivate directly there since this logic is starting to get complicated.
[JZ] Yes, only for deactivation system call.
>
>> +             if ((!(caps &
>> HSA_CAP_TRAP_DEBUG_PRECISE_MEMORY_OPERATIONS_SUPPORTED) &&
>> +                     (set_flags->mask &
>> KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP)) ||
>> +                     (!(caps &
>> HSA_CAP_TRAP_DEBUG_PRECISE_ALU_OPERATIONS_SUPPORTED) &&
>> +                 (set_flags->mask & KFD_DBG_TRAP_FLAG_SINGLE_ALU_OP))
>> ||
>> +                     (!(caps2 &
>> HSA_CAP2_TRAP_DEBUG_LDS_OUT_OF_ADDR_RANGE_SUPPORTED) &&
>> +                 (set_flags->mask &
>> KFD_DBG_TRAP_FLAG_LDS_OUT_OF_ADDR_RANGE))) {
> Please vertically indent align "caps" and "set_flags" underneath each other for legibility.
[JZ] sure
>
>> +                     set_flags->flags = prev_flags;
>> +                     dev_dbg(adev->dev, "Debug Trap set mask 0x%x caps
>> 0x%x caps2 0x%x",
>> +                             set_flags->mask, caps, caps2);
>>                        return -EACCES;
>>                }
>>        }
>>
>> -     target->dbg_flags = *flags;
>> -     *flags = prev_flags;
>> +     target->dbg_flags ^= (target->dbg_flags ^ set_flags->flags) & set_flags-
>>> mask;
> I assume we can only set/unset 1 flag at a time, which is why this works.
> Otherwise, maybe I don't have enough background on this, but please do elaborate.

[JZ] Here is not just for set/unset 1 bit. ^ is bitwise XOR.

target->dbg_flags ^ set_flags->flags can find which bits are changed

(set those bits to 1 if not equal), & set_flags-mask can filter out 
unrelated bits.

target->dbg_flags ^= can change related bits accordingly.

> mask

>
>> +     pr_debug("Debug Trap previous flags 0x%x set flags 0x%x set mask 0x%x
>> target flags 0x%x",
>> +             prev_flags, set_flags->flags, set_flags->mask, target-
>>> dbg_flags);
>> +
>> +     set_flags->flags = prev_flags;
>>        for (i = 0; i < target->n_pdds; i++) {
>>                struct kfd_process_device *pdd = target->pdds[i];
>>
>> @@ -555,10 +559,8 @@ int kfd_dbg_trap_set_flags(struct kfd_process *target,
>> uint32_t *flags)
>>                else
>>                        r = kfd_dbg_set_mes_debug_mode(pdd, true);
>>
>> -             if (r) {
>> -                     target->dbg_flags = prev_flags;
> Doesn't rewind require the previous value for something to rewind to?
[JZ] Yes, since after break,  it will rewind flags over there when check 
r value at the end.
>
>> +             if (r)
>>                        break;
>> -             }
>>
>>                rewind_count++;
>>        }
>> @@ -596,7 +598,7 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target,
>> bool unwind, int unwind
>>        int i;
>>
>>        if (!unwind) {
>> -             uint32_t flags = 0;
>> +             struct kfd_ioctl_dbg_trap_set_flags_args set_flags = {0,
>> 0xFFFFFFFF};
> Similar to 3 comments above.  Just call a new static or loop directly to reset debug mode (i.e. all flags clear).
> i.e. you don't have to rewind on preemption failure here because deactivation itself is a rewind we're pretty much out of luck at this point if we fail.
[JZ] I will send out v2 soon, See if your concern gets addressed or not.
>
> Jon
>
>>                int resume_count = resume_queues(target, 0, NULL);
>>
>>                if (resume_count)
>> @@ -606,7 +608,7 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target,
>> bool unwind, int unwind
>>                kfd_dbg_clear_process_address_watch(target);
>>                kfd_dbg_trap_set_wave_launch_mode(target, 0);
>>
>> -             kfd_dbg_trap_set_flags(target, &flags);
>> +             kfd_dbg_trap_set_flags(target, &set_flags);
>>        }
>>
>>        for (i = 0; i < target->n_pdds; i++) {
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
>> index 894753818cba..34d27eb500a5 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
>> @@ -62,7 +62,8 @@ int kfd_dbg_trap_set_dev_address_watch(struct
>> kfd_process_device *pdd,
>>                                        uint32_t watch_address_mask,
>>                                        uint32_t *watch_id,
>>                                        uint32_t watch_mode);
>> -int kfd_dbg_trap_set_flags(struct kfd_process *target, uint32_t *flags);
>> +int kfd_dbg_trap_set_flags(struct kfd_process *target,
>> +             struct kfd_ioctl_dbg_trap_set_flags_args *set_flags);
>>   int kfd_dbg_trap_query_exception_info(struct kfd_process *target,
>>                uint32_t source_id,
>>                uint32_t exception_code,
>> diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
>> index e9b756ca228c..0522fe7344e4 100644
>> --- a/include/uapi/linux/kfd_ioctl.h
>> +++ b/include/uapi/linux/kfd_ioctl.h
>> @@ -1515,6 +1515,7 @@ struct
>> kfd_ioctl_dbg_trap_clear_node_address_watch_args {
>>    *     Sets flags for wave behaviour.
>>    *
>>    *     @flags (IN/OUT) - IN = flags to enable, OUT = flags previously enabled
>> + *     @mask  (IN)     - IN = specified debug trap operation bits on flag
>>    *
>>    *     Generic errors apply (see kfd_dbg_trap_operations).
>>    *     Return - 0 on SUCCESS.
>> @@ -1522,7 +1523,7 @@ struct
>> kfd_ioctl_dbg_trap_clear_node_address_watch_args {
>>    */
>>   struct kfd_ioctl_dbg_trap_set_flags_args {
>>        __u32 flags;
>> -     __u32 pad;
>> +     __u32 mask;
>>   };
>>
>>   /**
>> --
>> 2.34.1

[-- Attachment #2: Type: text/html, Size: 16331 bytes --]

  reply	other threads:[~2026-01-19 20:57 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-19 17:15 [PATCH] drm/amdkfd: add mask for debug trap flag setting James Zhu
2026-01-19 19:28 ` Kim, Jonathan
2026-01-19 20:57   ` James Zhu [this message]
2026-01-20 10:34 ` Six, Lancelot
2026-01-20 14:39   ` James Zhu
2026-01-20 14:52     ` Lancelot SIX

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4012256e-1553-4737-95dc-732be663ff88@amd.com \
    --to=jamesz@amd.com \
    --cc=James.Zhu@amd.com \
    --cc=Jonathan.Kim@amd.com \
    --cc=Lancelot.Six@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox