Intel-XE Archive on lore.kernel.org
From: "Gupta, Anshuman" <anshuman.gupta@intel.com>
To: Matthew Auld <matthew.auld@intel.com>, <intel-xe@lists.freedesktop.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Subject: Re: [Intel-xe] [PATCH v12 01/13] drm/xe: fix xe_device_mem_access_get() races
Date: Tue, 4 Jul 2023 20:59:52 +0530	[thread overview]
Message-ID: <69fb61e4-49e6-9809-6e9a-ee8a3f398959@intel.com> (raw)
In-Reply-To: <60fe1fd9-b9f8-3196-c43e-cace0c397f1b@intel.com>



On 7/4/2023 4:55 PM, Matthew Auld wrote:
> On 30/06/2023 16:22, Gupta, Anshuman wrote:
>>
>>
>> On 6/26/2023 4:20 PM, Matthew Auld wrote:
>>> It looks like there is at least one race here, given that the
>>> pm_runtime_suspended() check looks to return false if we are in the
>>> process of suspending the device (RPM_SUSPENDING vs RPM_SUSPENDED).  We
>>> later also do xe_pm_runtime_get_if_active(), but since the device is
>>> suspending or has now suspended, this doesn't do anything either.
>>> Following from this we can potentially return from
>>> xe_device_mem_access_get() with the device suspended or about to be,
>>> leading to broken behaviour.
>>>
>>> Attempt to fix this by always grabbing the runtime ref when our internal
>>> ref transitions from 0 -> 1. The hard part is then dealing with the
>>> runtime_pm callbacks also calling xe_device_mem_access_get() and
>>> deadlocking, which the pm_runtime_suspended() check prevented.
>>>
>>> v2:
>>>   - ct->lock looks to be primed with fs_reclaim, so holding that and
>>>     then allocating memory will cause lockdep to complain. Now that we
>>>     unconditionally grab the mem_access.lock around mem_access_{get,put},
>>>     we need to change the ordering wrt grabbing the ct->lock, since some
>>>     of the runtime_pm routines can allocate memory (or at least that's
>>>     what lockdep seems to suggest). Hopefully not a big deal. It might
>>>     be that there were already issues with this, just that the atomics
>>>     were "hiding" the potential issues.
>>> v3:
>>>   - Use Thomas Hellström' idea with tracking the active task that is
>>>     executing in the resume or suspend callback, in order to avoid
>>>     recursive resume/suspend calls deadlocking on itself.
>>>   - Split the ct->lock change.
>>> v4:
>>>   - Add smp_mb() around accessing the pm_callback_task for extra safety.
>>>     (Thomas Hellström)
>>> v5:
>>>   - Clarify the kernel-doc for the mem_access.lock, given that it is
>>>     quite strange in what it protects (data vs code). The real
>>>     motivation is to aid lockdep. (Rodrigo Vivi)
>>> v6:
>>>   - Split out the lock change. We still want this as a lockdep aid but
>>>     only for the xe_device_mem_access_get() path. Sticking a lock on the
>>>     put() looks to be a no-go, also the runtime_put() there is always
>>>     async.
>>>   - Now that the lock is gone move to atomics and rely on the pm code
>>>     serialising multiple callers on the 0 -> 1 transition.
>>>   - g2h_worker_func() looks to be the next issue, given that
>>>     suspend-resume callbacks are using CT, so try to handle that.
>>> v7:
>>>   - Add xe_device_mem_access_get_if_ongoing(), and use it in
>>>     g2h_worker_func().
>>>
>>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/258
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> Cc: Anshuman Gupta <anshuman.gupta@intel.com>
>>> ---
>>>   drivers/gpu/drm/xe/xe_device.c       | 58 +++++++++++++++++++-----
>>>   drivers/gpu/drm/xe/xe_device.h       | 12 ++---
>>>   drivers/gpu/drm/xe/xe_device_types.h |  6 +++
>>>   drivers/gpu/drm/xe/xe_guc_ct.c       | 34 +++++++++++++-
>>>   drivers/gpu/drm/xe/xe_pm.c           | 66 +++++++++++++++++++---------
>>>   drivers/gpu/drm/xe/xe_pm.h           |  3 +-
>>>   6 files changed, 134 insertions(+), 45 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>> index c7985af85a53..1dc552da434f 100644
>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>> @@ -411,27 +411,61 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
>>>           DIV_ROUND_UP(size, NUM_BYTES_PER_CCS_BYTE) : 0;
>>>   }
>>> +bool xe_device_mem_access_ongoing(struct xe_device *xe)
>>> +{
>>> +    if (xe_pm_read_callback_task(xe) != NULL)
>>> +        return true;
>>> +
>>> +    return atomic_read(&xe->mem_access.ref);
>>> +}
>>> +
>>> +void xe_device_assert_mem_access(struct xe_device *xe)
>>> +{
>>> +    XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
>>> +}
>>> +
>>> +bool xe_device_mem_access_get_if_ongoing(struct xe_device *xe)
>>> +{
>>> +    return atomic_inc_not_zero(&xe->mem_access.ref);
>>> +}
>>> +
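For reference, the atomic_inc_not_zero() semantics that xe_device_mem_access_get_if_ongoing() relies on can be modelled in user-space C11 atomics. This is a hypothetical sketch, not the kernel implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* User-space model of atomic_inc_not_zero(): increment the counter only
 * if it is currently non-zero, so the 0 -> 1 transition is forced through
 * the slow path that first takes a full runtime-pm reference. */
static bool inc_not_zero(atomic_int *ref)
{
    int old = atomic_load(ref);

    do {
        if (old == 0)
            return false;      /* refuse to take 0 -> 1 here */
    } while (!atomic_compare_exchange_weak(ref, &old, old + 1));

    return true;
}
```

The compare-exchange loop makes the "check for zero, then increment" step atomic, which is what lets callers in contexts that cannot sleep (like g2h_worker_func()) piggyback on an already-ongoing access without ever waking the device themselves.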
>>>   void xe_device_mem_access_get(struct xe_device *xe)
>>>   {
>>> -    bool resumed = xe_pm_runtime_resume_if_suspended(xe);
>>> -    int ref = atomic_inc_return(&xe->mem_access.ref);
>>> +    /*
>>> +     * This looks racy, but should be fine since the pm_callback_task
>>> +     * only transitions from NULL -> current (and back to NULL again),
>>> +     * during the runtime_resume() or runtime_suspend() callbacks, for
>>> +     * which there can only be a single one running for our device. We
>>> +     * only need to prevent recursively calling the runtime_get or
>>> +     * runtime_put from those callbacks, as well as preventing
>>> +     * triggering any access_ongoing asserts.
>>> +     */
>> Two runtime_suspend() callbacks can run in parallel for two different
>> PCI devices, with their worker threads pooled by the pm_wq workqueue.
>> Is it guaranteed that the task_struct will be different for two
>> workers spawned by the same pm_wq?
> 
> IIUC we only use pm_wq for the async put(). If somehow task_struct can
> be the same for different workers when using pm_wq (I'm not sure tbh),
> then I think it's fine since all put() must be balanced with a get(),
> and all get() are handled directly in the caller (not pm_wq) since it's
> non-async, and must first wait for any running callback.
Agree with that.
> 
>>
>>> +    if (xe_pm_read_callback_task(xe) == current)
>>> +        return;
>>> -    if (ref == 1)
>>> -        xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
>>> +    if (!atomic_inc_not_zero(&xe->mem_access.ref)) {
>>> +        bool hold_rpm = xe_pm_runtime_resume_and_get(xe);
>>> +        int ref;
>>> -    /* The usage counter increased if device was immediately resumed */
>>> -    if (resumed)
>>> -        xe_pm_runtime_put(xe);
>>> -
>>> -    XE_WARN_ON(ref == S32_MAX);
>>> +        ref = atomic_inc_return(&xe->mem_access.ref);
>>> +        if (ref == 1)
>>> +            xe->mem_access.hold_rpm = hold_rpm;
>>> +        else
>>> +            xe_pm_runtime_put(xe);
AFAIU the above is not possible?
>>> +    } else {
>>> +        XE_WARN_ON(atomic_read(&xe->mem_access.ref) == S32_MAX);
>>> +    }
>>>   }
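The reworked get()/put() flow above can be sketched in user-space C11 atomics. Everything here (the rpm_* helpers, the task handling, the usage counter) is an illustrative stand-in for the kernel API, not the real implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_int mem_access_ref;
static bool hold_rpm;
static int rpm_usage;              /* models the runtime-pm usage count */
static void *pm_callback_task;     /* task running a pm callback, if any */

static void *current_task(void) { static int me; return &me; }
static bool rpm_resume_and_get(void) { rpm_usage++; return true; }
static void rpm_put(void) { rpm_usage--; }

/* Increment only if already non-zero (atomic_inc_not_zero()). */
static bool inc_not_zero(atomic_int *ref)
{
    int old = atomic_load(ref);
    do {
        if (old == 0)
            return false;
    } while (!atomic_compare_exchange_weak(ref, &old, old + 1));
    return true;
}

static void mem_access_get(void)
{
    /* Re-entrant call from a pm callback: skip, to avoid deadlock. */
    if (pm_callback_task == current_task())
        return;

    if (!inc_not_zero(&mem_access_ref)) {
        /* 0 -> 1: take a full synchronous runtime-pm reference first. */
        bool got = rpm_resume_and_get();

        if (atomic_fetch_add(&mem_access_ref, 1) + 1 == 1)
            hold_rpm = got;        /* winner of the 0 -> 1 race keeps it */
        else
            rpm_put();             /* loser drops its extra rpm reference */
    }
}

static void mem_access_put(void)
{
    if (pm_callback_task == current_task())
        return;

    /* Dropping the last reference releases the runtime-pm reference. */
    if (atomic_fetch_sub(&mem_access_ref, 1) == 1 && hold_rpm)
        rpm_put();
}
```

The key property is that the device is always fully resumed before mem_access.ref can become non-zero, so any caller seeing a non-zero ref can rely on the device being awake.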
>>>   void xe_device_mem_access_put(struct xe_device *xe)
>>>   {
>>> -    bool hold = xe->mem_access.hold_rpm;
>>> -    int ref = atomic_dec_return(&xe->mem_access.ref);
>>> +    int ref;
>>> -    if (!ref && hold)
>>> +    if (xe_pm_read_callback_task(xe) == current)
>>> +        return;
>>> +
>>> +    ref = atomic_dec_return(&xe->mem_access.ref);
>>> +    if (ref == 0 && xe->mem_access.hold_rpm)
>>>           xe_pm_runtime_put(xe);
>>>       XE_WARN_ON(ref < 0);
>> /snip
>>> +
>>>   int xe_pm_runtime_suspend(struct xe_device *xe)
>>>   {
>>>       struct xe_gt *gt;
>>>       u8 id;
>>> -    int err;
>>> +    int err = 0;
>>> +
>>> +    if (xe->d3cold_allowed && xe_device_mem_access_ongoing(xe))
>>> +        return -EBUSY;
>> Not related to this patch, but we should return -EBUSY for d3hot as
>> well.
> 
> Looking at this again, access_ongoing is always going to be false here,
> right? On the 0 -> 1 transition we always do a full sync pm get before
> incrementing mem_access.ref, so I'm not sure this check actually does
> anything.
I believe this is a paranoid check against any unbalanced put.
Br,
Anshuman Gupta.
> 
>> Br,
>> Anshuman Gupta
>>> +
>>> +    /* Disable access_ongoing asserts and prevent recursive pm calls */
>>> +    xe_pm_write_callback_task(xe, current);
>>>       if (xe->d3cold_allowed) {
>>> -        if (xe_device_mem_access_ongoing(xe))
>>> -            return -EBUSY;
>>> -
>>>           err = xe_bo_evict_all(xe);
>>>           if (err)
>>> -            return err;
>>> +            goto out;
>>>       }
>>>       for_each_gt(gt, xe, id) {
>>>           err = xe_gt_suspend(gt);
>>>           if (err)
>>> -            return err;
>>> +            goto out;
>>>       }
>>>       xe_irq_suspend(xe);
>>> -
>>> -    return 0;
>>> +out:
>>> +    xe_pm_write_callback_task(xe, NULL);
>>> +    return err;
>>>   }
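The pm_callback_task guard used in the suspend/resume paths above can be modelled the same way. The names and the single-threaded notion of a "task" here are assumptions for illustration only:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Model of the pm_callback_task guard: the suspend/resume callback
 * publishes its own task before doing any work, so re-entrant
 * mem_access calls issued from inside the callback bail out early
 * instead of recursing into runtime-pm and deadlocking. */
static void *_Atomic pm_callback_task;
static int reentered;                      /* counts guarded re-entries */

static void *current_task(void) { static int me; return &me; }

static void write_callback_task(void *task)
{
    atomic_store(&pm_callback_task, task); /* seq_cst, standing in for
                                            * the smp_mb() ordering */
}

static void *read_callback_task(void)
{
    return atomic_load(&pm_callback_task);
}

static void mem_access_get(void)
{
    if (read_callback_task() == current_task()) {
        reentered++;                       /* recursion guard hit */
        return;
    }
    /* ... normal 0 -> 1 handling would run here ... */
}

static int runtime_suspend(void)
{
    write_callback_task(current_task());
    mem_access_get();      /* e.g. a CT send issued by the suspend path */
    write_callback_task(NULL);
    return 0;
}
```

This is why the guard works even though reading pm_callback_task "looks racy": only the single callback running for this device ever stores a non-NULL value, and only its own task pointer can ever match current.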
>>>   int xe_pm_runtime_resume(struct xe_device *xe)
>>>   {
>>>       struct xe_gt *gt;
>>>       u8 id;
>>> -    int err;
>>> +    int err = 0;
>>> +
>>> +    /* Disable access_ongoing asserts and prevent recursive pm calls */
>>> +    xe_pm_write_callback_task(xe, current);
>>>       if (xe->d3cold_allowed) {
>>>           for_each_gt(gt, xe, id) {
>>>               err = xe_pcode_init(gt);
>>>               if (err)
>>> -                return err;
>>> +                goto out;
>>>           }
>>>           /*
>>> @@ -182,7 +210,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>>>            */
>>>           err = xe_bo_restore_kernel(xe);
>>>           if (err)
>>> -            return err;
>>> +            goto out;
>>>       }
>>>       xe_irq_resume(xe);
>>> @@ -193,10 +221,11 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>>>       if (xe->d3cold_allowed) {
>>>           err = xe_bo_restore_user(xe);
>>>           if (err)
>>> -            return err;
>>> +            goto out;
>>>       }
>>> -
>>> -    return 0;
>>> +out:
>>> +    xe_pm_write_callback_task(xe, NULL);
>>> +    return err;
>>>   }
>>>   int xe_pm_runtime_get(struct xe_device *xe)
>>> @@ -210,14 +239,9 @@ int xe_pm_runtime_put(struct xe_device *xe)
>>>       return pm_runtime_put_autosuspend(xe->drm.dev);
>>>   }
>>> -/* Return true if resume operation happened and usage count was increased */
>>> -bool xe_pm_runtime_resume_if_suspended(struct xe_device *xe)
>>> +bool xe_pm_runtime_resume_and_get(struct xe_device *xe)
>>>   {
>>> -    /* In case we are suspended we need to immediately wake up */
>>> -    if (pm_runtime_suspended(xe->drm.dev))
>>> -        return !pm_runtime_resume_and_get(xe->drm.dev);
>>> -
>>> -    return false;
>>> +    return !pm_runtime_resume_and_get(xe->drm.dev);
>>>   }
>>>   int xe_pm_runtime_get_if_active(struct xe_device *xe)
>>> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
>>> index 6a885585f653..e92c508d44b9 100644
>>> --- a/drivers/gpu/drm/xe/xe_pm.h
>>> +++ b/drivers/gpu/drm/xe/xe_pm.h
>>> @@ -19,7 +19,8 @@ int xe_pm_runtime_suspend(struct xe_device *xe);
>>>   int xe_pm_runtime_resume(struct xe_device *xe);
>>>   int xe_pm_runtime_get(struct xe_device *xe);
>>>   int xe_pm_runtime_put(struct xe_device *xe);
>>> -bool xe_pm_runtime_resume_if_suspended(struct xe_device *xe);
>>> +bool xe_pm_runtime_resume_and_get(struct xe_device *xe);
>>>   int xe_pm_runtime_get_if_active(struct xe_device *xe);
>>> +struct task_struct *xe_pm_read_callback_task(struct xe_device *xe);
>>>   #endif


Thread overview: 36+ messages
2023-06-26 10:50 [Intel-xe] [PATCH v12 00/13] xe_device_mem_access fixes and related bits Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 01/13] drm/xe: fix xe_device_mem_access_get() races Matthew Auld
2023-06-30 15:22   ` Gupta, Anshuman
2023-07-04 11:25     ` Matthew Auld
2023-07-04 15:29       ` Gupta, Anshuman [this message]
2023-07-04 16:00         ` Matthew Auld
2023-07-11  9:00           ` Gupta, Anshuman
2023-07-11 11:06             ` Matthew Auld
2023-07-11 17:56               ` Gupta, Anshuman
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 02/13] drm/xe/vm: tidy up xe_runtime_pm usage Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 03/13] drm/xe/debugfs: grab mem_access around forcewake Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 04/13] drm/xe/guc_pc: add missing mem_access for freq_rpe_show Matthew Auld
2023-06-27  6:53   ` Gupta, Anshuman
2023-06-27  8:20     ` Matthew Auld
2023-06-27 10:14       ` Gupta, Anshuman
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 05/13] drm/xe/mmio: grab mem_access in xe_mmio_ioctl Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 06/13] drm/xe: ensure correct access_put ordering Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 07/13] drm/xe/pci: wrap probe with mem_access Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 08/13] drm/xe/display: use mem_access underneath Matthew Auld
2023-06-28  9:51   ` Gupta, Anshuman
2023-06-29  9:19     ` Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 09/13] drm/xe/mmio: enforce xe_device_assert_mem_access Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 10/13] drm/xe: drop xe_device_mem_access_get() from guc_ct_send Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 11/13] drm/xe/ggtt: prime ggtt->lock against FS_RECLAIM Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 12/13] drm/xe: drop xe_device_mem_access_get() from invalidation_vma Matthew Auld
2023-06-26 10:50 ` [Intel-xe] [PATCH v12 13/13] drm/xe: add lockdep annotation for xe_device_mem_access_get() Matthew Auld
2023-06-26 12:55 ` [Intel-xe] ✓ CI.Patch_applied: success for xe_device_mem_access fixes and related bits (rev2) Patchwork
2023-06-26 12:56 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-06-26 12:57 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
2023-06-26 13:01 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-06-26 13:01 ` [Intel-xe] ✓ CI.Hooks: " Patchwork
2023-06-26 13:02 ` [Intel-xe] ✓ CI.checksparse: " Patchwork
2023-06-26 13:46 ` [Intel-xe] ○ CI.BAT: info " Patchwork
2023-06-30  6:21 ` [Intel-xe] [PATCH v12 00/13] xe_device_mem_access fixes and related bits Dixit, Ashutosh
2023-06-30 11:07   ` Matthew Auld
2023-06-30 16:59     ` Dixit, Ashutosh
