From: "Nilawar, Badal" <badal.nilawar@intel.com>
To: "Ghimiray, Himal Prasad" <himal.prasad.ghimiray@intel.com>,
Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
Matthew Brost <matthew.brost@intel.com>,
Lucas De Marchi <lucas.demarchi@intel.com>
Subject: Re: [RFC 7/9] drm/xe/gt_tlb_invalidation_ggtt: Call xe_force_wake_put if xe_force_wake_get succeds
Date: Tue, 10 Sep 2024 20:07:01 +0530
Message-ID: <122fbba9-8174-4c91-8085-5f5cc940db17@intel.com>
In-Reply-To: <43116f22-0495-44ec-9895-aad9dcd5165d@intel.com>
On 09-09-2024 14:59, Ghimiray, Himal Prasad wrote:
>
>
> On 06-09-2024 21:59, Rodrigo Vivi wrote:
>> On Fri, Sep 06, 2024 at 01:21:41AM +0530, Ghimiray, Himal Prasad wrote:
>>>
>>>
>>> On 06-09-2024 01:07, Rodrigo Vivi wrote:
>>>> On Fri, Aug 30, 2024 at 10:53:24AM +0530, Himal Prasad Ghimiray wrote:
>>>>> A failure in xe_force_wake_get() no longer increments the domain's
>>>>> refcount, so xe_force_wake_put() should not be called in such cases
>>>>>
>>>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>>>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>>>>> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 9 ++++++---
>>>>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>>>>> b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>>>>> index cca9cf536f76..3f86ab704c4f 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>>>>> @@ -259,11 +259,11 @@ static int xe_gt_tlb_invalidation_guc(struct
>>>>> xe_gt *gt,
>>>>> int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
>>>>> {
>>>>> struct xe_device *xe = gt_to_xe(gt);
>>>>> + int ret;
>>>>> if (xe_guc_ct_enabled(&gt->uc.guc.ct) &&
>>>>> gt->uc.guc.submission_state.enabled) {
>>>>> struct xe_gt_tlb_invalidation_fence fence;
>>>>> - int ret;
>>>>> xe_gt_tlb_invalidation_fence_init(gt, &fence, true);
>>>>> ret = xe_gt_tlb_invalidation_guc(gt, &fence);
>>>>> @@ -277,7 +277,9 @@ int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
>>>>> if (IS_SRIOV_VF(xe))
>>>>> return 0;
>>>>> - xe_gt_WARN_ON(gt, xe_force_wake_get(gt_to_fw(gt), XE_FW_GT));
>>>>> + ret = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
>>>>> + xe_gt_WARN_ON(gt, ret);
>>>>> +
>>>>> if (xe->info.platform == XE_PVC || GRAPHICS_VER(xe) >=
>>>>> 20) {
>>>>> xe_mmio_write32(gt, PVC_GUC_TLB_INV_DESC1,
>>>>> PVC_GUC_TLB_INV_DESC1_INVALIDATE);
>>>>> @@ -287,7 +289,8 @@ int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
>>>>> xe_mmio_write32(gt, GUC_TLB_INV_CR,
>>>>> GUC_TLB_INV_CR_INVALIDATE);
>>>>> }
>>>>> - xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
>>>>> + if (!ret)
>>>>> + xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
>>>>
>>>> looking all these cases now I honestly prefer the other way around.
>>>>
>>>> If we called the get, we call the put.
>>>> get always increase the reference and put does the clean-up.
>>>>
>>>> fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
>>>>
>>>> xe_force_wake_put(gt_to_fw(gt), fw_ref);
>>>>
>>>> so, the fw_ref is a mask of the woken up cases which require
>>>> the ref drop and sleep call.
>>>
>>> Hi Rodrigo,
>>>
>>> Thanks for the input. AFAIU using this approach creates issue in the
>>> subsequent force_wake_get/put in callee function. Which I have tried to
>>> explain in cover letter.
>>>
>>> [1] subsequent forcewake call by callee function assumes domains are
>>> already awake, which might not be true. This shows perfectly balanced
>>> xe_force_wake_get/_put can also cause problem.
>>>
>>> [1] func_a() {
>>> XE_WARN(xe_force_wake_get()) <---> fails but increments refcount
>>>
>>> func_b();
>>>
>>> XE_WARN(xe_force_wake_put());<---> decrements refcounts
>>> }
>>>
>>> func_b() {
>>> if(xe_force_wake_get()) <---> succeeds due to refcount of caller
>>> return;
>>>
>>> does mmio_operations(); <---> Domain might not be awake
>>>
>>> xe_force_wake_put(); <---> decrement refcount
>>> }
>>
>> Well, to be honest, this is what bugs me in this whole series.
>>
>> If func_a failed, why would function b succeed? If that's the
>> case should we include more redundancy and retries so the
>> func_a would succeed like the func_b is expected in your
>> scenario?
>
>
> Hi Rodrigo,
>
> This is current behavior, which patch [1] resolves. I misunderstood your
> comment as dropping of that patch and simply balancing all _gets with
> respective _puts.
>
>
>>
>> But other than that, I'm afraid that you didn't fully understand
>> my idea. Sorry for not being clear.
>>
>> My thought is, you do what you are doing in this series.
>> If the get doesn't succeed you drop the ref count and call the
>> disable.
>
>
> OK. IMO, just reducing refcount is better for failing domain and not to
> disable it explicitly
>
>
>>
>> The return of the get is just for the domains that have succeeded.
>> then the put returns only the ones that had succeeded.
>> The function B will then try to wake-up whatever had failed in
>> func_a.
>
> I assume with this, the return of xe_force_wake_get will return the
> mask, hence the caller will need to verify whether the returned mask is
> correct or failed.
>
>
>>
>> Something like:
>>
>>
>> func_a() {
>> fw_ref = xe_force_wake_get(ALL_DOMAINS) <---> fails GT-domain but
>> return a mask with all the domains except GT.
>>
>> XE_WARN(!fw_ref);
>
>
> XE_WARN(!fw_ref); will work for all individual domains but not ALL_DOMAINS
>
> XE_WARN(fw_ref != ALL_DOMAINS); <-- If user wants to continue -->
>
> if (fw_ref != ALL_DOMAINS) <--If user wants to return on failure -->
> xe_force_wake_put(fw_ref); <-- ensure to put awake domain -->
>
> return;
> }
>
>
>>
>> func_b();
>>
>> XE_WARN(xe_force_wake_put(fw_ref));<---> decrements refcounts of
>> the domains which were actually woken up.
>
> Makes sense.
>
>> }
>>
>> func_b() {
>> fw_ref = xe_force_wake_get(GT_DOMAIN);
>> if (!(fw_ref & GT_DOMAIN)) <---> likely fail anyway since func_a has
>> failed, but it at least tries it out because you have handled it in
>> your series...
>> return;
>>
>> does mmio_operations(); <---> Domain might not be awake
>>
>> xe_force_wake_put(fw_ref); <---> decrement refcount of the domains
>> you woke up.
>> }
>>
>> does it make sense now?
>
>
> Yes, this is indeed a much better approach for FORCEWAKE_ALL. Thank you
> for the suggestion. To summarize, rather than disabling the successfully
> awakened domain in the event of a failure, we will use forcewake_put to
> handle the disabling of them and user will decide when to call it.

This way of implementing looks ok to me. My only concern is what happens
if func_b() calls xe_force_wake_assert_held(): the assert will fire
because it will not find the expected domain awake. That doesn't align
with the idea of continuing after an ack failure. IMO the user decides
to continue even after a set-ack failure on the assumption that the
domain did wake up but the ack didn't arrive in time.

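To make the concern concrete, here is a rough userspace sketch, not
driver code. struct fake_fw, the FW_* bits and the fake_fw_*() helpers
are made-up stand-ins, and fake_fw_is_held() only models what I assume
xe_force_wake_assert_held() would check under the mask-based scheme.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical domain bits; not the real XE_FW_* values. */
#define FW_GT		(1u << 0)
#define FW_RENDER	(1u << 1)
#define FW_ALL		(FW_GT | FW_RENDER)
#define FW_COUNT	2

struct fake_fw {
	unsigned int awake;		/* domains currently ref-counted as awake */
	unsigned int ack_broken;	/* domains whose ack never arrives */
	int ref[FW_COUNT];
};

/* Returns the mask of domains that acked; only those take a reference. */
static unsigned int fake_fw_get(struct fake_fw *fw, unsigned int domains)
{
	unsigned int woken = 0;

	for (int i = 0; i < FW_COUNT; i++) {
		unsigned int bit = 1u << i;

		if (!(domains & bit) || (fw->ack_broken & bit))
			continue;	/* not requested, or ack timed out */
		fw->ref[i]++;
		fw->awake |= bit;
		woken |= bit;
	}
	return woken;
}

/* Drops references only for the domains named in fw_ref. */
static void fake_fw_put(struct fake_fw *fw, unsigned int fw_ref)
{
	for (int i = 0; i < FW_COUNT; i++) {
		unsigned int bit = 1u << i;

		if (!(fw_ref & bit))
			continue;
		assert(fw->ref[i] > 0);
		if (--fw->ref[i] == 0)
			fw->awake &= ~bit;
	}
}

/* Models the held check: true only if every requested domain is tracked. */
static bool fake_fw_is_held(const struct fake_fw *fw, unsigned int domains)
{
	return (fw->awake & domains) == domains;
}

/* func_b(): callee that expects GT to be held before doing MMIO. */
static bool func_b(const struct fake_fw *fw)
{
	return fake_fw_is_held(fw, FW_GT);	/* false: the assert would fire */
}

/* func_a(): warns on partial failure but chooses to continue. */
static bool func_a(struct fake_fw *fw)
{
	unsigned int fw_ref = fake_fw_get(fw, FW_ALL);
	bool b_ok;

	if (fw_ref != FW_ALL)
		printf("warn: partial forcewake, got 0x%x\n", fw_ref);

	b_ok = func_b(fw);
	fake_fw_put(fw, fw_ref);	/* balanced: drops only what was taken */
	return b_ok;
}
```

With ack_broken = FW_GT, func_a() keeps the refcounts balanced, but
func_b() reports GT as not held even though the caller decided to carry
on. That is the mismatch I am worried about.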
Regards,
Badal
>
>
>>
>>>
>>> BR
>>> Himal
>>>
>>>>
>>>>> }
>>>>> return 0;
>>>>> --
>>>>> 2.34.1
>>>>>