Intel-XE Archive on lore.kernel.org
From: Michal Wajdeczko <michal.wajdeczko@intel.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
	<intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH 06/10] drm/xe/sriov: Add debugfs with scheduler groups information
Date: Tue, 2 Dec 2025 22:31:42 +0100	[thread overview]
Message-ID: <f4dc11d9-442f-48e1-806c-1451dd4fced6@intel.com> (raw)
In-Reply-To: <9993fbe8-fd20-48cf-b82d-b0c070c1d5a0@intel.com>



On 12/2/2025 7:20 PM, Daniele Ceraolo Spurio wrote:
> 
> 
> On 12/2/2025 8:24 AM, Michal Wajdeczko wrote:
>>
>> On 11/27/2025 2:45 AM, Daniele Ceraolo Spurio wrote:
>>> When schedulers groups are enabled, we dynamically create new debugfs
>>> folders under the GT folder for the PF and each VF. The aim is to have
>> hmm, but since this is debugfs and the groups (even uninitialized) still exist,
>> maybe it is easier (and safer) to create a static list of groups (8 or 2)
>> and just print their configurations: either empty (if uninitialized) or with
>> engine names (after selecting a predefined EGS mode)
> 
> ok
> 
>>
>> the debugfs files (not directories) will then reflect the currently used configuration
> 
> See below about the directories
> 
>>
>>> all the info for each VF under the sched_groups folder, with individual
>>> folders per-group under it. Right now the only info is the engine list,
>>> but follow up patches will add execution quantum and preemption timeout
>>> as well (which are configurable per-group-per-VF).
>> but the groups' EQ/PT configuration is not per-group, only per-GT (GuC),
>> so it should be OK to have them one level up
> 
> ok
> 
>>
>>> Note that the engine lists are not configurable per-VF and therefore not
>>> strictly needed in the VF folders. However, it is still useful to have
>>> them there because an admin might want to check the engine list while
>>> configuring the EQ/PT values for a specific VF.
>> duplicating RO entries per-VF seems odd, maybe it's better to use symlinks:
>>
>>
>> /sys/kernel/debug/dri/BDF/
>> ├── sriov
>> :   ├── pf
>>      :   ├── tile0
>>          :   ├── gt0
>>              :   ├── sched_groups_mode            # disabled [media_slices]
>>                  ├── sched_groups_exec_quantums_ms    # 20 20
>>                  ├── sched_groups_preempt_timeouts_us    # 2000 2000
>>                  ├── sched_groups
>>                  :   ├── group0                # vcs0 vcs1 vecs0
>>                  :   ├── group1                # vcs2 vcs3 vecs1
>>                  :   ├── group2                #
>>                      :
>>                      └── group7
>>      ├── vf1
>>      :   ├── tile0
>>          :   ├── gt0
>>                  ├── sched_groups_exec_quantums_ms    # 30 30
>>                  ├── sched_groups_preempt_timeouts_us    # 3000 3000
>>                  ├── sched_groups --> ../../../pf/tile0/gt0/sched_groups
>>
>>      ├── vf2
>>      :   ├── tile0
>>          :   ├── gt0
>>                  ├── sched_groups_exec_quantums_ms    # 40 40
>>                  ├── sched_groups_preempt_timeouts_us    # 4000 4000
>>                  ├── sched_groups --> ../../../pf/tile0/gt0/sched_groups
> 
> My idea here was to have exec quantum and preempt timeout configurable for a single group, so the end result would be something like this for PF and each VF:
> 
>         :   ├── gt1
>         :   ├── sched_groups
>             :   ├── group0
>             :   ├── engines
>                 ├── exec_quantum_ms
>                 └── preempt_timeout_us
>             :
>             └── groupN
>                 ├── engines
>                 ├── exec_quantum_ms
>                 └── preempt_timeout_us
> 
> 
> Which only works if groups are folders (which is also why I can't symlink the whole sched_groups folder). If you think that's overkill

yes, since the GuC does not expose it, why bother?

this whole EGS feature requires an experienced admin, who should already know the configs for all groups

> and that we should just keep the option to set all the groups at once, then I can do that and not have folders. I can also just drop the engine lists from the VF debugfs. My idea here was that if an admin is setting the EQ/PT, e.g. for VF1 group 1, from within the VF debugfs folder, it might be useful to have a way to check which engines are in the group without having to go back to the PF folder, but it's not really an essential thing.

this useful hint about how the groups look can be provided by a symlink (which is to some extent overkill as well, since it points to a well-known location)

my point is that we should expose only the minimal set of files actually needed to convey all the required information

creating files with redundant info will just cause more test code to be developed ;)

> 
> 
>>
>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> ---
>>>   drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 140 +++++++++++++++++++-
>>>   drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h   |   4 +
>>>   2 files changed, 143 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>>> index 2953ef21a5ad..947e2b92d58a 100644
>>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>>> @@ -5,6 +5,7 @@
>>>     #include <linux/debugfs.h>
>>>   +#include <drm/drm_managed.h>
>>>   #include <drm/drm_print.h>
>>>   #include <drm/drm_debugfs.h>
>>>   @@ -21,6 +22,7 @@
>>>   #include "xe_gt_sriov_pf_monitor.h"
>>>   #include "xe_gt_sriov_pf_policy.h"
>>>   #include "xe_gt_sriov_pf_service.h"
>>> +#include "xe_guc.h"
>>>   #include "xe_pm.h"
>>>   #include "xe_sriov_pf.h"
>>>   #include "xe_sriov_pf_provision.h"
>>> @@ -162,8 +164,127 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
>>>    *          :   ├── tile0
>>>    *              :   ├── gt0
>>>    *                  :   ├── sched_groups_mode
>>> + *                      ├── sched_groups
>>> + *                      :   ├── group0
>>> + *                          :   └── engines
>>> + *                          :
>>> + *                          └── groupN
>>> + *          :                   └── engines
>>> + *          ├── vf1
>>> + *          :   ├── tile0
>>> + *              :   ├── gt0
>>> + *                  :   ├── sched_groups
>>> + *                      :   ├── group0
>>> + *                          :   └── engines
>>>    */
>>>   +struct sched_group_info {
>>> +    struct xe_gt *gt;
>> we should be able to get the gt and vfid from a parent's d_inode->i_private
> 
> In this implementation the parent of the engines folder is groupX, and that one doesn't have that info in d_inode->i_private. That might change if we drop the subfolders.

but the grandparent or great-grandparent will eventually have the expected priv data (and since we know the structure, we know how deep to search)

> 
> Daniele
> 
>>
>>> +    u32 *masks;
>>> +};
>>> +
>>> +static int sched_group_engines_info(struct seq_file *m, void *data)
>>> +{
>>> +    struct drm_printer p = drm_seq_file_printer(m);
>>> +    struct sched_group_info *gi = m->private;
>>> +    struct xe_gt *gt = gi->gt;
>>> +    struct xe_hw_engine *hwe;
>>> +    enum xe_hw_engine_id id;
>>> +    bool first = true;
>>> +
>>> +    for_each_hw_engine(hwe, gt, id) {
>>> +        u8 guc_class = xe_engine_class_to_guc_class(hwe->class);
>>> +        u32 mask = gi->masks[guc_class];
>>> +
>>> +        if (mask & BIT(hwe->logical_instance)) {
>>> +            drm_printf(&p, "%s%s", first ? "" : " ", hwe->name);
>>> +
>>> +            first = false;
>>> +        }
>>> +    }
>>> +
>>> +    drm_printf(&p, "\n");
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int sched_group_engines_open(struct inode *inode, struct file *file)
>>> +{
>>> +    return single_open(file, sched_group_engines_info, inode->i_private);
>>> +}
>>> +
>>> +static const struct file_operations sched_group_engines_fops = {
>>> +    .owner = THIS_MODULE,
>>> +    .open = sched_group_engines_open,
>>> +    .read = seq_read,
>>> +    .llseek = seq_lseek,
>>> +    .release = single_release,
>>> +};
>>> +
>>> +static void __sched_group_info_cleanup(struct xe_gt *gt, struct dentry *dent)
>>> +{
>>> +    if (dent->d_inode->i_private)
>>> +        drmm_kfree(&gt_to_xe(gt)->drm, dent->d_inode->i_private);
>>> +
>>> +    debugfs_remove_recursive(dent);
>>> +}
>>> +
>>> +#define GROUP_INFO_ROOT "sched_groups"
>>> +static void sched_group_info_register(struct xe_gt *gt, unsigned int vfid)
>>> +{
>>> +    struct xe_gt_sriov_pf_policy *policy = &gt->sriov.pf.policy;
>>> +    u32 mode = policy->guc.sched_groups.current_mode;
>>> +    u8 num_groups = policy->guc.sched_groups.modes[mode].num_masks / GUC_MAX_ENGINE_CLASSES;
>>> +    struct sched_group_info *infos;
>>> +    struct dentry *parent;
>>> +    struct dentry *old;
>>> +    u8 g;
>>> +
>>> +    if (!gt->sriov.pf.debugfs_roots)
>>> +        return;
>>> +
>>> +    /* remove existing debugfs entries for old groups */
>>> +    old = debugfs_lookup(GROUP_INFO_ROOT, gt->sriov.pf.debugfs_roots[vfid]);
>>> +    if (old)
>>> +        __sched_group_info_cleanup(gt, old);
>>> +
>>> +    /* re-create debugfs for new groups (if any)*/
>>> +    if (!num_groups)
>>> +        return;
>>> +
>>> +    parent = debugfs_create_dir(GROUP_INFO_ROOT, gt->sriov.pf.debugfs_roots[vfid]);
>>> +    if (IS_ERR(parent))
>>> +        return;
>>> +
>>> +    infos = drmm_kzalloc(&gt_to_xe(gt)->drm, sizeof(*infos) * num_groups, GFP_KERNEL);
>>> +    if (!infos)
>>> +        goto out_err;
>>> +    parent->d_inode->i_private = infos;
>>> +
>>> +    for (g = 0; g < num_groups; g++) {
>>> +        struct sched_group_info *info = &infos[g];
>>> +        u32 base = g * GUC_MAX_ENGINE_CLASSES;
>>> +        struct dentry *dent;
>>> +        char name[10];
>>> +
>>> +        snprintf(name, sizeof(name), "group%u", g);
>>> +        dent = debugfs_create_dir(name, parent);
>>> +        if (IS_ERR(dent))
>>> +            goto out_err;
>>> +
>>> +        info->gt = gt;
>>> +        info->masks = &policy->guc.sched_groups.modes[mode].masks[base];
>>> +
>>> +        dent->d_inode->i_private = info;
>>> +        debugfs_create_file("engines", 0644, dent, info, &sched_group_engines_fops);
>>> +    }
>>> +
>>> +    return;
>>> +
>>> +out_err:
>>> +    __sched_group_info_cleanup(gt, parent);
>>> +}
>>> +
>>>   static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
>>>   {
>>>       switch (mode) {
>>> @@ -220,7 +341,7 @@ static ssize_t sched_groups_write(struct file *file, const char __user *ubuf,
>>>       struct xe_gt *gt = extract_gt(file_inode(file)->i_private);
>>>       char name[32];
>>>       int ret;
>>> -    int m;
>>> +    int m, i;
>>>         if (*pos)
>>>           return -ESPIPE;
>>> @@ -250,6 +371,10 @@ static ssize_t sched_groups_write(struct file *file, const char __user *ubuf,
>>>       ret = xe_gt_sriov_pf_policy_set_sched_groups_mode(gt, m);
>>>       xe_pm_runtime_put(gt_to_xe(gt));
>>>   +    if (!ret)
>>> +        for (i = 0; i <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)); i++)
>>> +            sched_group_info_register(gt, i);
>>> +
>>>       return (ret < 0) ? ret : size;
>>>   }
>>>   @@ -693,6 +818,19 @@ void xe_gt_sriov_pf_debugfs_populate(struct xe_gt *gt, struct dentry *parent, un
>>>           return;
>>>       dent->d_inode->i_private = gt;
>>>   +    /*
>>> +     * we allocate an array to store the GT-level dentries for PF and all
>>> +     * VFs when creating the PF folder. Failing to create this allocation
>>> +     * is not fatal.
>>> +     */
>>> +    if (vfid == 0)
>>> +        gt->sriov.pf.debugfs_roots =
>>> +            drmm_kcalloc(&gt_to_xe(gt)->drm,
>>> +                     1 + xe_sriov_pf_get_totalvfs(gt_to_xe(gt)),
>>> +                     sizeof(struct dentry *), GFP_KERNEL);
>>> +    if (gt->sriov.pf.debugfs_roots)
>>> +        gt->sriov.pf.debugfs_roots[vfid] = dent;
>> for per-VF data we have struct xe_gt_sriov_metadata that can be accessed from
>> struct xe_gt_sriov_pf
>>
>> but maybe it shouldn't be needed at all if we create static egs files as
>> suggested above
>>
>>
>>> +
>>>       xe_gt_assert(gt, extract_gt(dent) == gt);
>>>       xe_gt_assert(gt, extract_vfid(dent) == vfid);
>>>   diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
>>> index 667b8310478d..747ec5dae652 100644
>>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
>>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
>>> @@ -15,6 +15,8 @@
>>>   #include "xe_gt_sriov_pf_policy_types.h"
>>>   #include "xe_gt_sriov_pf_service_types.h"
>>>   +struct dentry;
>>> +
>>>   /**
>>>    * struct xe_gt_sriov_metadata - GT level per-VF metadata.
>>>    */
>>> @@ -52,6 +54,7 @@ struct xe_gt_sriov_pf_workers {
>>>    * @migration: migration data.
>>>    * @spare: PF-only provisioning configuration.
>>>    * @vfs: metadata for all VFs.
>>> + * @debugfs_roots: GT debugfs roots for the PF and all VFs.
>>>    */
>>>   struct xe_gt_sriov_pf {
>>>       struct xe_gt_sriov_pf_workers workers;
>>> @@ -60,6 +63,7 @@ struct xe_gt_sriov_pf {
>>>       struct xe_gt_sriov_pf_policy policy;
>>>       struct xe_gt_sriov_spare_config spare;
>>>       struct xe_gt_sriov_metadata *vfs;
>>> +    struct dentry **debugfs_roots;
>>>   };
>>>     #endif
> 

