From: "Lis, Tomasz" <tomasz.lis@intel.com>
To: Michal Wajdeczko <michal.wajdeczko@intel.com>,
<intel-xe@lists.freedesktop.org>
Cc: "Michał Winiarski" <michal.winiarski@intel.com>,
"Piotr Piórkowski" <piotr.piorkowski@intel.com>,
"Matthew Brost" <matthew.brost@intel.com>,
"Lucas De Marchi" <lucas.demarchi@intel.com>
Subject: Re: [PATCH v1 6/7] drm/xe/vf: Rebase MEMIRQ structures for all contexts after migration
Date: Fri, 16 May 2025 00:07:47 +0200
Message-ID: <64e6641d-7b0d-41a5-87f5-54a03d6bece1@intel.com>
In-Reply-To: <689654c7-b47c-445d-be19-76671fda4a4a@intel.com>
On 14.05.2025 22:03, Michal Wajdeczko wrote:
>
> On 14.05.2025 00:49, Tomasz Lis wrote:
>> All contexts require an update of state data, as the data includes
>> GGTT references to memirq-related buffers.
>>
>> Default contexts need these references updated as well, because they
>> are not refreshed when a new context is created from them.
>>
>> Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_lrc.c | 41 ++++++++++++++++++++++++++++++++
>> drivers/gpu/drm/xe/xe_lrc.h | 2 ++
>> drivers/gpu/drm/xe/xe_sriov_vf.c | 17 +++++++++++--
>> 3 files changed, 58 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>> index 43e1c18e1769..5a7f0077ef31 100644
>> --- a/drivers/gpu/drm/xe/xe_lrc.c
>> +++ b/drivers/gpu/drm/xe/xe_lrc.c
>> @@ -898,6 +898,47 @@ static void *empty_lrc_data(struct xe_hw_engine *hwe)
>> return data;
>> }
>>
>> +/**
>> + * xe_default_lrc_update_memirq_regs_with_address - Re-compute GGTT references in default LRC
>> + * of given engine.
>> + * @hwe: the &xe_hw_engine struct instance
>> + */
>> +void xe_default_lrc_update_memirq_regs_with_address(struct xe_hw_engine *hwe)
>> +{
>> + struct xe_gt *gt = hwe->gt;
>> + u32 *regs;
>> +
>> + if (!gt->default_lrc[hwe->class])
>> + return;
>> +
>> + regs = gt->default_lrc[hwe->class] + LRC_PPHWSP_SIZE;
>> + set_memory_based_intr(regs, hwe);
>> +}
>> +
>> +/**
>> + * xe_lrc_update_memirq_regs_with_address - Re-compute GGTT references in mem interrupt data
>> + * for given LRC.
>> + * @hwe: the &xe_hw_engine struct instance
>> + * @lrc: the &xe_lrc struct instance
>> + */
>> +void xe_lrc_update_memirq_regs_with_address(struct xe_hw_engine *hwe, struct xe_lrc *lrc)
>> +{
>> + struct xe_gt *gt = hwe->gt;
>> + struct iosys_map map;
>> + size_t regs_len;
>> + u32 *regs;
>> +
>> + map = __xe_lrc_regs_map(lrc);
>> + regs_len = lrc_reg_size(gt_to_xe(gt));
>> + regs = kzalloc(regs_len, GFP_ATOMIC);
>> + if (!regs)
>> + return;
> no error ? but recovery will be now broken, no?
If even a 300-byte allocation fails, then something will definitely be
broken.
We use a `GFP_ATOMIC` allocation, which, if the quick path fails, has an
expensive fallback capable of using reserves. If that fails too, then the
system must have encountered something really bad. GFX recovery won't be
the only thing affected.
>> + xe_map_memcpy_from(gt_to_xe(gt), regs, &map, 0, regs_len);
>> + set_memory_based_intr(regs, hwe);
>> + xe_map_memcpy_to(gt_to_xe(gt), &map, 0, regs, regs_len);
>> + kfree(regs);
> maybe instead of this alloc + RMW + free just update:
>
> [CTX_INT_MASK_ENABLE_PTR]
> [CTX_INT_STATUS_REPORT_PTR]
> [CTX_INT_SRC_REPORT_PTR]
>
> using 3x xe_lrc_write_ctx_reg() like it was done in patch 5/7 ?
Ok, we have a precedent for that in `xe_lrc_init()` (so we're setting
the values twice there). Avoiding the unnecessary allocation and copy
makes up for the slight duplication, and the final code should still be
shorter. Good suggestion.
-Tomasz
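
For illustration only, here is a userspace sketch of the shape the
suggested change could take: three in-place dword writes instead of
alloc + copy + modify + copy back + free. Everything in it is a
hypothetical stand-in, not driver code: `struct mock_lrc`,
`struct mock_memirq`, `mock_lrc_write_ctx_reg()` and the slot indices
are made up for the example, and only the three `CTX_INT_*_PTR` names
are taken from the review comment above.

```c
#include <stdint.h>

/* Hypothetical slot indices; the real CTX_INT_*_PTR offsets live in the driver. */
enum {
	CTX_INT_MASK_ENABLE_PTR,
	CTX_INT_STATUS_REPORT_PTR,
	CTX_INT_SRC_REPORT_PTR,
	CTX_NUM_REGS,
};

/* Mock LRC: a flat array standing in for the mapped context register state. */
struct mock_lrc {
	uint32_t regs[CTX_NUM_REGS];
};

/* Mock GGTT addresses of the memirq buffers (assumed layout, for illustration). */
struct mock_memirq {
	uint32_t enable_ggtt;
	uint32_t status_ggtt;
	uint32_t source_ggtt;
};

/* Stand-in for a single-register write helper: one dword written in place. */
static void mock_lrc_write_ctx_reg(struct mock_lrc *lrc, int reg_nr, uint32_t val)
{
	lrc->regs[reg_nr] = val;
}

/*
 * Rewrite only the three GGTT references after the addresses have moved.
 * Nothing is allocated, so there is no failure path to report or ignore.
 */
static void mock_lrc_update_memirq_regs(struct mock_lrc *lrc,
					const struct mock_memirq *memirq)
{
	mock_lrc_write_ctx_reg(lrc, CTX_INT_MASK_ENABLE_PTR, memirq->enable_ggtt);
	mock_lrc_write_ctx_reg(lrc, CTX_INT_STATUS_REPORT_PTR, memirq->status_ggtt);
	mock_lrc_write_ctx_reg(lrc, CTX_INT_SRC_REPORT_PTR, memirq->source_ggtt);
}
```

Since the update can no longer fail, the "no error?" question about the
`kzalloc()` path goes away along with the temporary buffer.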
>> +}
>> +
>> static void xe_lrc_set_ppgtt(struct xe_lrc *lrc, struct xe_vm *vm)
>> {
>> u64 desc = xe_vm_pdp4_descriptor(vm, gt_to_tile(lrc->gt));
>> diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>> index e7a99cfd0abe..3f0ae3affafe 100644
>> --- a/drivers/gpu/drm/xe/xe_lrc.h
>> +++ b/drivers/gpu/drm/xe/xe_lrc.h
>> @@ -89,6 +89,8 @@ u32 xe_lrc_indirect_ring_ggtt_addr(struct xe_lrc *lrc);
>> u32 xe_lrc_ggtt_addr(struct xe_lrc *lrc);
>> u32 *xe_lrc_regs(struct xe_lrc *lrc);
>> void xe_lrc_update_hwctx_regs_with_address(struct xe_lrc *lrc);
>> +void xe_default_lrc_update_memirq_regs_with_address(struct xe_hw_engine *hwe);
>> +void xe_lrc_update_memirq_regs_with_address(struct xe_hw_engine *hwe, struct xe_lrc *lrc);
>>
>> u32 xe_lrc_read_ctx_reg(struct xe_lrc *lrc, int reg_nr);
>> void xe_lrc_write_ctx_reg(struct xe_lrc *lrc, int reg_nr, u32 val);
>> diff --git a/drivers/gpu/drm/xe/xe_sriov_vf.c b/drivers/gpu/drm/xe/xe_sriov_vf.c
>> index 016faa29cddd..c08c44dbd383 100644
>> --- a/drivers/gpu/drm/xe/xe_sriov_vf.c
>> +++ b/drivers/gpu/drm/xe/xe_sriov_vf.c
>> @@ -225,12 +225,23 @@ static int vf_post_migration_requery_guc(struct xe_device *xe)
>> return ret;
>> }
>>
>> +static void xe_gt_default_lrcs_hwsp_rebase(struct xe_gt *gt)
>> +{
>> + struct xe_hw_engine *hwe;
>> + enum xe_hw_engine_id id;
>> +
>> + for_each_hw_engine(hwe, gt, id)
>> + xe_default_lrc_update_memirq_regs_with_address(hwe);
>> +}
>> +
>> static void xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *eq)
>> {
>> int i;
>>
>> - for (i = 0; i < eq->width; ++i)
>> + for (i = 0; i < eq->width; ++i) {
>> + xe_lrc_update_memirq_regs_with_address(eq->hwe, eq->lrc[i]);
>> xe_lrc_update_hwctx_regs_with_address(eq->lrc[i]);
>> + }
>> }
>>
>> static void xe_guc_contexts_hwsp_rebase(struct xe_guc *guc)
>> @@ -249,8 +260,10 @@ static void vf_post_migration_fixup_contexts(struct xe_device *xe)
>> struct xe_gt *gt;
>> unsigned int id;
>>
>> - for_each_gt(gt, xe, id)
>> + for_each_gt(gt, xe, id) {
>> + xe_gt_default_lrcs_hwsp_rebase(gt);
>> xe_guc_contexts_hwsp_rebase(&gt->uc.guc);
>> + }
>> }
>>
>> static void vf_post_migration_fixup_ctb(struct xe_device *xe)