From: Gustavo Sousa <gustavo.sousa@intel.com>
To: Michal Wajdeczko <michal.wajdeczko@intel.com>,
<intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH 4/6] drm/xe: Separate early xe_device initialization
Date: Mon, 25 May 2026 16:05:50 -0300 [thread overview]
Message-ID: <87tsrvgvht.fsf@intel.com> (raw)
In-Reply-To: <cb7116e6-e713-41e6-a687-8e814ff2dcde@intel.com>
Michal Wajdeczko <michal.wajdeczko@intel.com> writes:
> On 5/25/2026 8:21 PM, Gustavo Sousa wrote:
>> Michal Wajdeczko <michal.wajdeczko@intel.com> writes:
>>
>>> We would like to initialize more of the xe_device struct also from
>>> the kunit code, as it should be safe to use most of the generic drm
>>> or xe components without doing any additional tweaks. Separate early
>>> xe initialization code to a new function, so it can be reused.
>>>
>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_device.c | 39 ++++++++++++++++++++++++----------
>>> drivers/gpu/drm/xe/xe_device.h | 1 +
>>> 2 files changed, 29 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>> index b498147dcf61..7ba407f73a02 100644
>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>> @@ -506,27 +506,45 @@ struct xe_device *xe_device_create(struct pci_dev *pdev)
>>> if (IS_ERR(xe))
>>> return xe;
>>>
>>> + err = xe_device_init_early(xe);
>>> + if (err)
>>> + return ERR_PTR(err);
>>> +
>>> + return xe;
>>> +}
>>> +ALLOW_ERROR_INJECTION(xe_device_create, ERRNO); /* See xe_pci_probe() */
>>> +
>>> +/**
>>> + * xe_device_init_early() - Initialize a new &xe_device instance
>>> + * @xe: the &xe_device to initialize
>>> + *
>>> + * Return: 0 on success or a negative error code on failure.
>>> + */
>>> +int xe_device_init_early(struct xe_device *xe)
>>> +{
>>> + int err;
>>> +
>>> err = ttm_device_init(&xe->ttm, &xe_ttm_funcs, xe->drm.dev,
>>> xe->drm.anon_inode->i_mapping,
>>> xe->drm.vma_offset_manager, 0);
>>> - if (WARN_ON(err))
>>> - return ERR_PTR(err);
>>> + if (err)
>>> + return err;
>>>
>>> xe_bo_dev_init(&xe->bo_device);
>>> err = drmm_add_action_or_reset(&xe->drm, xe_device_destroy, NULL);
>>
>> Curious: are the drmm_* calls going to work fine for the automated
>> teardown when they are called from a kunit context?
>
> that's correct, see:
>
> WARNING: drivers/gpu/drm/xe/tests/xe_kunit_helpers.c:67 at my_drm_fini+0x72/0x80, CPU#0: kunit_try_catch/20
> drm-kunit-mock-device test_data.drm-kunit-mock-device: [drm] WARN_ON(true)
> CPU: 0 UID: 0 PID: 20 Comm: kunit_try_catch Tainted: G W N 7.1.0-rc5-g547691814d7b-dirty #1020 VOLUNTARY
> Tainted: [W]=WARN, [N]=TEST
> Stack:
> 607ffcd5 00000000 00000000 00000001
> ffffff00 607ffcd5 00000009 60029a90
> 6086f348 60064b86 605cf832 a088bcb0
> Call Trace:
> [<60029a90>] ? _printk+0x0/0x65
> [<60064b86>] ? dump_stack_lvl+0x6f/0xb8
> [<605cf832>] ? my_drm_fini+0x72/0x80
> [<600231a3>] ? __warn.cold+0x82/0x1cf
> [<601aacb0>] ? kfree_const+0x0/0x40
> [<60023384>] ? warn_slowpath_fmt+0x94/0xa1
> [<60409500>] ? __drm_dev_dbg+0x0/0xb0
> [<605cf832>] ? my_drm_fini+0x72/0x80
> [<603fb4a7>] ? drm_managed_release+0xa7/0x190
> [<605ec9a0>] ? devres_log+0x0/0xc0
> [<603e6590>] ? devm_drm_dev_init_release+0x60/0x90
> [<605ecb75>] ? release_nodes+0x55/0x80
> [<6069ae80>] ? mutex_lock_nested+0x0/0x20
> [<605ee3f7>] ? devres_release_all+0x97/0xf0
> [<605e68c0>] ? bus_notify+0x0/0x60
> [<60695df0>] ? mutex_unlock+0x0/0x20
> [<605e6f54>] ? device_unbind_cleanup+0x14/0xb0
> [<605e88e6>] ? device_release_driver_internal+0x266/0x2c0
> [<60669f50>] ? kobject_put+0x0/0x150
> [<602ef010>] ? sysfs_remove_file_ns+0x0/0x20
> [<605e64fd>] ? bus_remove_device+0x11d/0x1b0
> [<6069ae80>] ? mutex_lock_nested+0x0/0x20
> [<605dfc1f>] ? device_del+0x1bf/0x5f0
> [<60075330>] ? um_set_signals+0x0/0x60
> [<60075320>] ? um_get_signals+0x0/0x10
> [<606a2160>] ? _raw_spin_lock_irqsave+0x0/0xa0
> [<605e0064>] ? device_unregister+0x14/0x40
> [<60375cde>] ? kunit_remove_resource+0x8e/0xd0
> [<606a24d0>] ? _raw_spin_unlock_irqrestore+0x0/0xb0
> [<60375323>] ? kunit_cleanup+0x63/0xb0
> [<60377a70>] ? kunit_generic_run_threadfn_adapter+0x0/0x30
> [<60377a86>] ? kunit_generic_run_threadfn_adapter+0x16/0x30
>
>
>
>>
>> What about devm_* calls? From a quick look, I know that at least
>> xe_bo_pinned_init() will make one of those.
>
> kunit device follows device model, it is just created on a dummy bus,
> but everything else works fine, see:
>
> WARNING: drivers/gpu/drm/xe/tests/xe_kunit_helpers.c:60 at my_dev_fini+0x6f/0x80, CPU#0: kunit_try_catch/20
> drm-kunit-mock-device test_data.drm-kunit-mock-device: [drm] WARN_ON(true)
> CPU: 0 UID: 0 PID: 20 Comm: kunit_try_catch Tainted: G N 7.1.0-rc5-g547691814d7b-dirty #1020 VOLUNTARY
> Tainted: [N]=TEST
> Stack:
> 607ffcd5 00000000 00000000 00000001
> ffffff00 607ffcd5 00000009 60029a90
> 6086f348 60064b86 605cf7af a088bd00
> Call Trace:
> [<60029a90>] ? _printk+0x0/0x65
> [<60064b86>] ? dump_stack_lvl+0x6f/0xb8
> [<605cf7af>] ? my_dev_fini+0x6f/0x80
> [<600231a3>] ? __warn.cold+0x82/0x1cf
> [<605ee3b5>] ? devres_release_all+0x55/0xf0
> [<60023384>] ? warn_slowpath_fmt+0x94/0xa1
> [<605ec9a0>] ? devres_log+0x0/0xc0
> [<605cf7af>] ? my_dev_fini+0x6f/0x80
> [<605ecb75>] ? release_nodes+0x55/0x80
> [<6069ae80>] ? mutex_lock_nested+0x0/0x20
> [<605ee3f7>] ? devres_release_all+0x97/0xf0
> [<605e68c0>] ? bus_notify+0x0/0x60
> [<60695df0>] ? mutex_unlock+0x0/0x20
> [<605e6f54>] ? device_unbind_cleanup+0x14/0xb0
> [<605e88e6>] ? device_release_driver_internal+0x266/0x2c0
> [<60669f50>] ? kobject_put+0x0/0x150
> [<602ef010>] ? sysfs_remove_file_ns+0x0/0x20
> [<605e64fd>] ? bus_remove_device+0x11d/0x1b0
> [<6069ae80>] ? mutex_lock_nested+0x0/0x20
> [<605dfc1f>] ? device_del+0x1bf/0x5f0
> [<60075330>] ? um_set_signals+0x0/0x60
> [<60075320>] ? um_get_signals+0x0/0x10
> [<606a2160>] ? _raw_spin_lock_irqsave+0x0/0xa0
> [<605e0064>] ? device_unregister+0x14/0x40
> [<60375cde>] ? kunit_remove_resource+0x8e/0xd0
> [<606a24d0>] ? _raw_spin_unlock_irqrestore+0x0/0xb0
> [<60375323>] ? kunit_cleanup+0x63/0xb0
> [<60377a70>] ? kunit_generic_run_threadfn_adapter+0x0/0x30
> [<60377a86>] ? kunit_generic_run_threadfn_adapter+0x16/0x30
>
>>
>>> if (err)
>>> - return ERR_PTR(err);
>>> + return err;
>>>
>>> err = xe_shrinker_create(xe);
>>> if (err)
>>> - return ERR_PTR(err);
>>> + return err;
>>>
>>> xe->atomic_svm_timeslice_ms = 5;
>>> xe->min_run_period_lr_ms = 5;
>>>
>>> err = xe_irq_init(xe);
>>> if (err)
>>> - return ERR_PTR(err);
>>> + return err;
>>>
>>> xe_validation_device_init(&xe->val);
>>>
>>> @@ -536,7 +554,7 @@ struct xe_device *xe_device_create(struct pci_dev *pdev)
>>>
>>> err = xe_pagemap_shrinker_create(xe);
>>> if (err)
>>> - return ERR_PTR(err);
>>> + return err;
>>>
>>> xa_init_flags(&xe->usm.asid_to_vm, XA_FLAGS_ALLOC);
>>>
>>> @@ -555,7 +573,7 @@ struct xe_device *xe_device_create(struct pci_dev *pdev)
>>>
>>> err = xe_bo_pinned_init(xe);
>>> if (err)
>>> - return ERR_PTR(err);
>>> + return err;
>>>
>>> xe->preempt_fence_wq = alloc_ordered_workqueue("xe-preempt-fence-wq",
>>> WQ_MEM_RECLAIM);
>>> @@ -569,16 +587,15 @@ struct xe_device *xe_device_create(struct pci_dev *pdev)
>>> * drmm_add_action_or_reset register above
>>> */
>>> drm_err(&xe->drm, "Failed to allocate xe workqueues\n");
>>> - return ERR_PTR(-ENOMEM);
>>> + return -ENOMEM;
>>> }
>>>
>>> err = drmm_mutex_init(&xe->drm, &xe->pmt.lock);
>>> if (err)
>>> - return ERR_PTR(err);
>>> + return err;
>>>
>>> - return xe;
>>> + return 0;
>>> }
>>> -ALLOW_ERROR_INJECTION(xe_device_create, ERRNO); /* See xe_pci_probe() */
>>
>> So, there are things allocated/initialized in xe_device_create() that
>> get deallocated/finalized in xe_device_destroy(). With this change, we
>> now have things extracted out of xe_device_create() into
>> xe_device_init_early(), and it appears that xe_device_destroy() is now
>> really the counterpart of the latter than the former.
>>
>> I think we should just call the function xe_device_init() and then
>> rename xe_device_destroy() to xe_device_fini().
>
> there is yet another small pending series that introduces drmm_alloc_workqueue
> and after that converts all WQ initializations into self-contained statements
>
> didn't include here, as it touches include/drm so can be done in its own pace
>
> but with that xe_device_destroy (which should never be called that way)
> will be greatly reduced, and even possibly completely removed after moving
> registration of some cleanup steps to xe_bo_dev_init and/or ttm_init
>
>>
>>>
>>> static bool xe_driver_flr_disabled(struct xe_device *xe)
>>> {
>>> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
>>> index 27cd2329b99f..975768a6a9c8 100644
>>> --- a/drivers/gpu/drm/xe/xe_device.h
>>> +++ b/drivers/gpu/drm/xe/xe_device.h
>>> @@ -44,6 +44,7 @@ static inline struct xe_device *ttm_to_xe_device(struct ttm_device *ttm)
>>> }
>>>
>>> struct xe_device *xe_device_create(struct pci_dev *pdev);
>>> +int xe_device_init_early(struct xe_device *xe);
>>
>> Right now we do not have any external users for this function. We could
>> keep it static until we get to use them in kunit.
>
> my kunit series just got too big so I've started posting smaller chunks
> don't really want to go into static/public noise just for few days delay
> between series
>
>>
>> That said, I think it would also make sense to separate it completely
>> from xe_device_create(), i.e., have xe_pci.c call the two functions
>> instead of xe_device_init_early() being implicitly called by
>> xe_device_create(). I think that would be more alligned with the
>> function names.
>
> hmm, maybe
>
> but that would likely require also extracting call to
>
> aperture_remove_conflicting_pci_devices
>
> and I'm not ready for that
>
> so IMO it's better to keep "xe_device_create()" as "Allocate and initialize"
> as we have now in the kernel-doc ;)
Fair.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
>
>>
>> --
>> Gustavo Sousa
>>
>>> int xe_device_probe_early(struct xe_device *xe);
>>> int xe_device_probe(struct xe_device *xe);
>>> void xe_device_remove(struct xe_device *xe);
>>> --
>>> 2.47.1
next prev parent reply other threads:[~2026-05-25 19:06 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-25 16:05 [PATCH 0/6] drm/xe: Misc initialization improvements Michal Wajdeczko
2026-05-25 16:05 ` [PATCH 1/6] drm/xe: Drop unused param from xe_device_create() Michal Wajdeczko
2026-05-25 18:13 ` Raag Jadav
2026-05-25 18:45 ` Gustavo Sousa
2026-05-25 16:05 ` [PATCH 2/6] drm/xe: Move xe->info.force_execlist initialization Michal Wajdeczko
2026-05-25 18:46 ` Gustavo Sousa
2026-05-25 16:05 ` [PATCH 3/6] drm/xe: Move xe->info.devid|revid initialization Michal Wajdeczko
2026-05-25 17:47 ` Gustavo Sousa
2026-05-25 18:25 ` Michal Wajdeczko
2026-05-25 19:12 ` Gustavo Sousa
2026-05-25 19:41 ` [PATCH v2 " Michal Wajdeczko
2026-05-25 20:01 ` Gustavo Sousa
2026-05-25 16:05 ` [PATCH 4/6] drm/xe: Separate early xe_device initialization Michal Wajdeczko
2026-05-25 18:21 ` Gustavo Sousa
2026-05-25 18:57 ` Michal Wajdeczko
2026-05-25 19:05 ` Gustavo Sousa [this message]
2026-05-25 19:00 ` Gustavo Sousa
2026-05-25 19:14 ` Michal Wajdeczko
2026-05-25 18:22 ` Raag Jadav
2026-05-25 16:05 ` [PATCH 5/6] drm/xe/pm: Don't access device in init_early() Michal Wajdeczko
2026-05-25 18:30 ` Raag Jadav
2026-05-25 18:47 ` Gustavo Sousa
2026-05-27 15:46 ` Rodrigo Vivi
2026-05-25 16:05 ` [PATCH 6/6] drm/xe/pm: Do early initialization " Michal Wajdeczko
2026-05-25 18:47 ` Gustavo Sousa
2026-05-25 16:36 ` ✓ CI.KUnit: success for drm/xe: Misc initialization improvements Patchwork
2026-05-25 17:42 ` ✓ Xe.CI.BAT: " Patchwork
2026-05-25 19:48 ` ✓ CI.KUnit: success for drm/xe: Misc initialization improvements (rev2) Patchwork
2026-05-25 20:49 ` ✓ Xe.CI.BAT: " Patchwork
2026-05-26 1:59 ` ✓ Xe.CI.FULL: " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87tsrvgvht.fsf@intel.com \
--to=gustavo.sousa@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=michal.wajdeczko@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox