Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cavitt <jonathan.cavitt@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: saurabhg.gupta@intel.com, alex.zuo@intel.com,
	jonathan.cavitt@intel.com, joonas.lahtinen@linux.intel.com,
	matthew.brost@intel.com, jianxun.zhang@intel.com,
	shuicheng.lin@intel.com, dri-devel@lists.freedesktop.org,
	Michal.Wajdeczko@intel.com, michal.mrozek@intel.com
Subject: [PATCH v10 0/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
Date: Thu, 20 Mar 2025 15:26:10 +0000	[thread overview]
Message-ID: <20250320152616.74587-1-jonathan.cavitt@intel.com> (raw)

Add additional information to each VM so they can report up to the first
50 seen faults.  Only pagefaults are saved this way currently, though in
the future, all faults should be tracked by the VM for future reporting.

Additionally, of the pagefaults reported, only failed pagefaults are
saved this way, as successful pagefaults should recover silently and not
need to be reported to userspace.

To allow userspace to access these faults, a new ioctl -
xe_vm_get_property_ioct - was created.

v2: (Matt Brost)
- Break full ban list request into a separate property.
- Reformat drm_xe_vm_get_property struct.
- Remove need for drm_xe_faults helper struct.
- Separate data pointer and scalar return value in ioctl.
- Get address type on pagefault report and save it to the pagefault.
- Correctly reject writes to read-only VMAs.
- Miscellaneous formatting fixes.

v3: (Matt Brost)
- Only allow querying of failed pagefaults

v4:
- Remove unnecessary size parameter from helper function, as it
  is a property of the arguments. (jcavitt)
- Remove unnecessary copy_from_user (Jainxun)
- Set address_precision to 1 (Jainxun)
- Report max size instead of dynamic size for memory allocation
  purposes.  Total memory usage is reported separately.

v5:
- Return int from xe_vm_get_property_size (Shuicheng)
- Fix memory leak (Shuicheng)
- Remove unnecessary size variable (jcavitt)

v6:
- Free vm after use (Shuicheng)
- Compress pf copy logic (Shuicheng)
- Update fault_unsuccessful before storing (Shuicheng)
- Fix old struct name in comments (Shuicheng)
- Keep first 50 pagefaults instead of last 50 (Jianxun)
- Rename ioctl to xe_vm_get_faults_ioctl (jcavitt)

v7:
- Avoid unnecessary execution by checking MAX_PFS earlier (jcavitt)
- Fix double-locking error (jcavitt)
- Assert kmemdump is successful (Shuicheng)
- Repair and move fill_faults break condition (Dan Carpenter)
- Free vm after use (jcavitt)
- Combine assertions (jcavitt)
- Expand size check in xe_vm_get_faults_ioctl (jcavitt)
- Remove return mask from fill_faults, as return is already -EFAULT or 0
  (jcavitt)

v8:
- Revert back to using drm_xe_vm_get_property_ioctl
- s/Migrate/Move (Michal)
- s/xe_pagefault/xe_gt_pagefault (Michal)
- Create new header file, xe_gt_pagefault_types.h (Michal)
- Add and fix kernel docs (Michal)
- Rename xe_vm.pfs to xe_vm.faults (jcavitt)
- Store fault data and not pagefault in xe_vm faults list (jcavitt)
- Store address, address type, and address precision per fault (jcavitt)
- Store engine class and instance data per fault (Jianxun)
- Properly handle kzalloc error (Michal W)
- s/MAX_PFS/MAX_FAULTS_SAVED_PER_VM (Michal W)
- Store fault level per fault (Micahl M)
- Apply better copy_to_user logic (jcavitt)

v9:
- More kernel doc fixes (Michal W, Jianxun)
- Better error handling (jcavitt)

v10:
- Convert enums to defines in regs folder (Michal W)
- Move xe_guc_pagefault_desc to regs folder (Michal W)
- Future-proof size logic for zero-size properties (jcavitt)
- Replace address type extern with access type (Jianxun)
- Add fault type to xe_drm_fault (Jianxun)

Signed-off-by: Jonathan Cavitt <joanthan.cavitt@intel.com>
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Cc: Zhang Jianxun <jianxun.zhang@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>

Jonathan Cavitt (5):
  drm/xe/xe_gt_pagefault: Disallow writes to read-only VMAs
  drm/xe/xe_gt_pagefault: Move pagefault struct to header
  drm/xe/uapi: Define drm_xe_vm_get_property
  drm/xe/xe_vm: Add per VM fault info
  drm/xe/xe_vm: Implement xe_vm_get_property_ioctl

 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h |  50 +++++
 drivers/gpu/drm/xe/xe_device.c              |   3 +
 drivers/gpu/drm/xe/xe_gt_pagefault.c        |  67 +++----
 drivers/gpu/drm/xe/xe_gt_pagefault_types.h  |  42 +++++
 drivers/gpu/drm/xe/xe_guc_fwif.h            |  28 ---
 drivers/gpu/drm/xe/xe_vm.c                  | 194 ++++++++++++++++++++
 drivers/gpu/drm/xe/xe_vm.h                  |  11 ++
 drivers/gpu/drm/xe/xe_vm_types.h            |  34 ++++
 include/uapi/drm/xe_drm.h                   |  79 ++++++++
 9 files changed, 447 insertions(+), 61 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/regs/xe_pagefault_desc.h
 create mode 100644 drivers/gpu/drm/xe/xe_gt_pagefault_types.h

-- 
2.43.0


             reply	other threads:[~2025-03-20 15:26 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-20 15:26 Jonathan Cavitt [this message]
2025-03-20 15:26 ` [PATCH v10 1/5] drm/xe/xe_gt_pagefault: Disallow writes to read-only VMAs Jonathan Cavitt
2025-03-20 15:26 ` [PATCH v10 2/5] drm/xe/xe_gt_pagefault: Move pagefault struct to header Jonathan Cavitt
2025-03-20 15:26 ` [PATCH v10 3/5] drm/xe/uapi: Define drm_xe_vm_get_property Jonathan Cavitt
2025-03-20 15:26 ` [PATCH v10 4/5] drm/xe/xe_vm: Add per VM fault info Jonathan Cavitt
2025-03-20 15:26 ` [PATCH v10 5/5] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl Jonathan Cavitt
2025-03-21 23:36   ` Raag Jadav
2025-03-24 16:57     ` Cavitt, Jonathan
2025-03-24 21:25       ` Raag Jadav
2025-03-24 21:31         ` Cavitt, Jonathan
2025-03-25  7:19           ` Raag Jadav
2025-03-25 14:44             ` Cavitt, Jonathan
2025-03-25 23:36               ` Raag Jadav
2025-03-20 16:27 ` ✓ CI.Patch_applied: success for drm/xe/xe_vm: Implement xe_vm_get_property_ioctl (rev11) Patchwork
2025-03-20 16:27 ` ✗ CI.checkpatch: warning " Patchwork
2025-03-20 16:28 ` ✓ CI.KUnit: success " Patchwork
2025-03-20 16:45 ` ✓ CI.Build: " Patchwork
2025-03-20 16:47 ` ✗ CI.Hooks: failure " Patchwork
2025-03-20 16:48 ` ✓ CI.checksparse: success " Patchwork
2025-03-20 17:08 ` ✓ Xe.CI.BAT: " Patchwork
2025-03-20 18:01 ` ✗ Xe.CI.Full: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250320152616.74587-1-jonathan.cavitt@intel.com \
    --to=jonathan.cavitt@intel.com \
    --cc=Michal.Wajdeczko@intel.com \
    --cc=alex.zuo@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jianxun.zhang@intel.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=matthew.brost@intel.com \
    --cc=michal.mrozek@intel.com \
    --cc=saurabhg.gupta@intel.com \
    --cc=shuicheng.lin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox