Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/9] Fix xe_force_wake_get() failure handling
@ 2024-08-30  5:23 Himal Prasad Ghimiray
  2024-08-30  5:18 ` ✓ CI.Patch_applied: success for " Patchwork
                   ` (16 more replies)
  0 siblings, 17 replies; 34+ messages in thread
From: Himal Prasad Ghimiray @ 2024-08-30  5:23 UTC (permalink / raw)
  To: intel-xe
  Cc: Himal Prasad Ghimiray, Badal Nilawar, Rodrigo Vivi,
	Lucas De Marchi, Nirmoy Das, Matthew Brost,
	Daniele Ceraolo Spurio, Ashutosh Dixit

In case of xe_force_wake_get() call the refcount for domains
are incremented irrespective of failure or success, which can lead to
undefined behaviour: 
[1] subsequent forcewake call by callee function assumes domains are 
already awake, which might not be true. This shows perfectly balanced 
xe_force_wake_get/_put can also cause problem.

[2] The functions are using xe_force_wake_get() and return on failures.
In this scenario only the first caller returns and other functions will
always assume domains to be awake.

[1] func_a() {
	XE_WARN(xe_force_wake_get()) <---> fails but increments refcount

	func_b();

	XE_WARN(xe_force_wake_put());<---> decrements refcounts 
 }

    func_b() {
	if(xe_force_wake_get()) <---> succeeds due to refcount of caller
		return;

	does mmio_operations(); <---> Domain might not be awake

	xe_force_wake_put(); <---> decrement refcount
 }
  	
[2] func_a() {
	int err = xe_force_wake_get(); <---> returns if fails, inc refcount
	if (err)
		return;

	mmio_operations();

	xe_force_wake_put(); 
 }

    func_b() {
        if (xe_force_wake_get()) <---> succeeds due to refcnt of func_a()
              return;

        mmio_operations(); <---> Invalid accesses

        xe_force_wake_put(); <---> decrements refcount for func_b
 }

This RFC proposes not incrementing the refcount in case of
xe_force_wake_get() failure and putting all domains awakened by the caller 
to sleep and decrease the reference count for all requested domains. 
This prevents xe_force_wake_get() from leaving an unhandled reference 
count in case of failure. 

With these changes in place caller needs to ensure xe_force_wake_put() is
called only if xe_force_wake_get() succeeds.

Patches breakdown:

1. Changes in xe_force_wake_get() and documentations
   - Patches 1 to 2

2. Ensure xe_force_wake_put() is called when xe_force_wake_get() succeds
   - Patches 3 to 7

3. Modify xe_force_wake_put to return void and warn for failure
   - Patch 8

4. Debugfs handling for forcewake open
   - Patch 9

Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Himal Prasad Ghimiray (9):
  drm/xe: Error handling in xe_force_wake_get()
  drm/xe: Ensure __must_check for xe_force_wake_get() return
  drm/xe/gsc: call xe_force_wake_put() only if xe_force_wake_get()
    succeeds
  drm/xe/gt: call xe_force_wake_put() only if xe_force_wake_get()
    succeeds
  drm/xe/guc: call xe_force_wake_put() only if xe_force_wake_get()
    succeeds
  drm/xe/oa: Handle force_wake_get failure in xe_oa_stream_init()
  drm/xe/gt_tlb_invalidation_ggtt: Call xe_force_wake_put if
    xe_force_wake_get succeds
  drm/xe: Change return type to void for xe_force_wake_put
  drm/xe: forcewake debugfs open fails on xe_forcewake_get failure

 drivers/gpu/drm/xe/xe_debugfs.c             | 28 +++++++--
 drivers/gpu/drm/xe/xe_device.c              |  3 +-
 drivers/gpu/drm/xe/xe_drm_client.c          |  2 +-
 drivers/gpu/drm/xe/xe_force_wake.c          | 64 +++++++++++++++++----
 drivers/gpu/drm/xe/xe_force_wake.h          |  9 +--
 drivers/gpu/drm/xe/xe_gsc.c                 | 14 +++--
 drivers/gpu/drm/xe/xe_gt.c                  | 25 ++++----
 drivers/gpu/drm/xe/xe_gt_debugfs.c          |  2 +-
 drivers/gpu/drm/xe/xe_gt_idle.c             | 25 +++++---
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |  9 ++-
 drivers/gpu/drm/xe/xe_guc.c                 |  9 ++-
 drivers/gpu/drm/xe/xe_guc_pc.c              | 15 +++--
 drivers/gpu/drm/xe/xe_oa.c                  | 13 +++--
 drivers/gpu/drm/xe/xe_pat.c                 | 10 ++--
 drivers/gpu/drm/xe/xe_reg_sr.c              |  6 +-
 drivers/gpu/drm/xe/xe_vram.c                |  3 +-
 16 files changed, 165 insertions(+), 72 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2024-09-11  6:51 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-30  5:23 [RFC 0/9] Fix xe_force_wake_get() failure handling Himal Prasad Ghimiray
2024-08-30  5:18 ` ✓ CI.Patch_applied: success for " Patchwork
2024-08-30  5:18 ` ✓ CI.checkpatch: " Patchwork
2024-08-30  5:19 ` ✓ CI.KUnit: " Patchwork
2024-08-30  5:23 ` [RFC 1/9] drm/xe: Error handling in xe_force_wake_get() Himal Prasad Ghimiray
2024-08-30  6:37   ` Jani Nikula
2024-08-30  6:45     ` Ghimiray, Himal Prasad
2024-09-05 19:29   ` Rodrigo Vivi
2024-09-05 20:02     ` Ghimiray, Himal Prasad
2024-09-06 16:18       ` Rodrigo Vivi
2024-09-10 18:27         ` Nilawar, Badal
2024-09-11  6:51           ` Ghimiray, Himal Prasad
2024-09-11  6:40       ` Upadhyay, Tejas
2024-08-30  5:23 ` [RFC 2/9] drm/xe: Ensure __must_check for xe_force_wake_get() return Himal Prasad Ghimiray
2024-09-05 19:30   ` Rodrigo Vivi
2024-08-30  5:23 ` [RFC 3/9] drm/xe/gsc: call xe_force_wake_put() only if xe_force_wake_get() succeeds Himal Prasad Ghimiray
2024-08-30  5:23 ` [RFC 4/9] drm/xe/gt: " Himal Prasad Ghimiray
2024-08-30  5:23 ` [RFC 5/9] drm/xe/guc: " Himal Prasad Ghimiray
2024-08-30  5:23 ` [RFC 6/9] drm/xe/oa: Handle force_wake_get failure in xe_oa_stream_init() Himal Prasad Ghimiray
2024-08-30  5:23 ` [RFC 7/9] drm/xe/gt_tlb_invalidation_ggtt: Call xe_force_wake_put if xe_force_wake_get succeds Himal Prasad Ghimiray
2024-09-05 19:37   ` Rodrigo Vivi
2024-09-05 19:51     ` Ghimiray, Himal Prasad
2024-09-06 16:29       ` Rodrigo Vivi
2024-09-09  9:29         ` Ghimiray, Himal Prasad
2024-09-10 14:37           ` Nilawar, Badal
2024-09-10 17:39             ` Rodrigo Vivi
2024-09-10 17:53               ` Nilawar, Badal
2024-08-30  5:23 ` [RFC 8/9] drm/xe: Change return type to void for xe_force_wake_put Himal Prasad Ghimiray
2024-08-30  5:23 ` [RFC 9/9] drm/xe: forcewake debugfs open fails on xe_forcewake_get failure Himal Prasad Ghimiray
2024-08-30  5:32 ` ✓ CI.Build: success for Fix xe_force_wake_get() failure handling Patchwork
2024-08-30  5:37 ` ✓ CI.Hooks: " Patchwork
2024-08-30  5:42 ` ✓ CI.checksparse: " Patchwork
2024-08-30  6:05 ` ✓ CI.BAT: " Patchwork
2024-08-30 17:41 ` ✓ CI.FULL: " Patchwork

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox