From: Matthew Auld <matthew.auld@intel.com>
To: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>,
intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: paulo.r.zanoni@intel.com, jani.nikula@intel.com,
thomas.hellstrom@intel.com, daniel.vetter@intel.com,
christian.koenig@amd.com
Subject: Re: [Intel-gfx] [PATCH v9 23/23] drm/i915/vm_bind: Support capture of persistent mappings
Date: Tue, 13 Dec 2022 12:03:07 +0000
Message-ID: <f684e366-417e-e087-764f-390f65ebd0f9@intel.com>
In-Reply-To: <20221212231527.2384-24-niranjana.vishwanathapura@intel.com>
On 12/12/2022 23:15, Niranjana Vishwanathapura wrote:
> Support dump capture of persistent mappings upon user request.
>
> Capture of a mapping is requested with the VM_BIND ioctl and
> processed during the GPU error handling, thus not adding any
> additional latency to the submission path.
>
> A list of persistent vmas requiring capture is maintained
> instead of a list of vma resources. This allows for no
> additional handling around eviction.
>
> v2: enable with CONFIG_DRM_I915_CAPTURE_ERROR, remove gfp
> overwrite, add kernel-doc and expand commit message
>
> Signed-off-by: Brian Welty <brian.welty@intel.com>
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> ---
> .../gpu/drm/i915/gem/i915_gem_vm_bind_object.c | 13 +++++++++++++
> drivers/gpu/drm/i915/gt/intel_gtt.c | 5 +++++
> drivers/gpu/drm/i915/gt/intel_gtt.h | 7 +++++++
> drivers/gpu/drm/i915/i915_gpu_error.c | 18 +++++++++++++++++-
> drivers/gpu/drm/i915/i915_vma.c | 4 ++++
> drivers/gpu/drm/i915/i915_vma_types.h | 4 ++++
> include/uapi/drm/i915_drm.h | 9 +++++++--
> 7 files changed, 57 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> index 78e7c0642c5f..562a67a988f2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> @@ -88,6 +88,12 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
> {
> lockdep_assert_held(&vma->vm->vm_bind_lock);
>
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> + mutex_lock(&vma->vm->vm_capture_lock);
> + if (!list_empty(&vma->vm_capture_link))
> + list_del_init(&vma->vm_capture_link);
> + mutex_unlock(&vma->vm->vm_capture_lock);
> +#endif
> spin_lock(&vma->vm->vm_rebind_lock);
> if (!list_empty(&vma->vm_rebind_link))
> list_del_init(&vma->vm_rebind_link);
> @@ -357,6 +363,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
> continue;
> }
>
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> + if (va->flags & I915_GEM_VM_BIND_CAPTURE) {
> + mutex_lock(&vm->vm_capture_lock);
> + list_add_tail(&vma->vm_capture_link, &vm->vm_capture_list);
> + mutex_unlock(&vm->vm_capture_lock);
> + }
> +#endif
> list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
> i915_vm_bind_it_insert(vma, &vm->va);
> if (!obj->priv_root)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 2e4c9fabf3b8..103ca55222be 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -297,6 +297,11 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
> spin_lock_init(&vm->vm_rebind_lock);
> spin_lock_init(&vm->userptr_invalidated_lock);
> INIT_LIST_HEAD(&vm->userptr_invalidated_list);
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> + INIT_LIST_HEAD(&vm->vm_capture_list);
> + mutex_init(&vm->vm_capture_lock);
> +#endif
> }
>
> void *__px_vaddr(struct drm_i915_gem_object *p)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 620b4e020a9f..7f69e1d4fb5e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -281,6 +281,13 @@ struct i915_address_space {
> /** @root_obj: root object for dma-resv sharing by private objects */
> struct drm_i915_gem_object *root_obj;
>
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> + /* @vm_capture_list: list of vm captures */
> + struct list_head vm_capture_list;
> + /* @vm_capture_lock: protects vm_capture_list */
> + struct mutex vm_capture_lock;
> +#endif
> +
> /* Global GTT */
> bool is_ggtt:1;
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 9d5d5a397b64..76b2834ce958 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1460,6 +1460,22 @@ capture_vma(struct intel_engine_capture_vma *next,
> return next;
> }
>
> +static struct intel_engine_capture_vma *
> +capture_user_vm(struct intel_engine_capture_vma *capture,
> + struct i915_address_space *vm, gfp_t gfp)
> +{
> + struct i915_vma *vma;
> +
> + mutex_lock(&vm->vm_capture_lock);
> + /* vma->resource must be valid here as persistent vmas are bound */
> + list_for_each_entry(vma, &vm->vm_capture_list, vm_capture_link)
> + capture = capture_vma_snapshot(capture, vma->resource,
Thinking some more on this, I don't think we can actually do this. The
vma->resource at this stage could be complete nonsense (could even be
NULL?), if you consider pipelined migrations. For example, if we async
evict something, the object state can be freely updated (maybe even more
than once), even though the dma-resv is still active with fences. This
is allowed since the actual move(s) will be pipelined and should respect
those fences. In eb2 this is solved by holding the object lock and
taking a snapshot of the vma at submit time, which should ensure we are
capturing the correct vma->resource and sg_table. Maybe I'm missing
something.
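
To illustrate the concern, here is a minimal userspace model, not i915
code. Every model_* name is hypothetical, and a plain counter stands in
for the real reference counting and locking. It only contrasts reading
vma->resource at error time with taking a reference at submit time.

```c
/* Userspace model only: none of these types or helpers exist in i915. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_resource {
	int refcount;
	unsigned long start;	/* address the mapping was bound at */
};

struct model_vma {
	/* May be replaced or cleared once a move has been queued. */
	struct model_resource *resource;
};

static struct model_resource *model_resource_create(unsigned long start)
{
	struct model_resource *res = malloc(sizeof(*res));

	if (!res)
		return NULL;
	res->refcount = 1;
	res->start = start;
	return res;
}

static struct model_resource *model_resource_get(struct model_resource *res)
{
	if (res)
		res->refcount++;
	return res;
}

static void model_resource_put(struct model_resource *res)
{
	if (!res)
		return;
	assert(res->refcount > 0);
	if (--res->refcount == 0)
		free(res);
}

/*
 * Submit time: take a reference to what the request will actually use.
 * The real driver would do this with the object lock held (assumption
 * based on the eb2 description above).
 */
static struct model_resource *model_snapshot_at_submit(struct model_vma *vma)
{
	return model_resource_get(vma->resource);
}

/*
 * Pipelined eviction: the vma state is updated immediately, while the
 * actual move would run later behind the fences.
 */
static void model_async_evict(struct model_vma *vma)
{
	model_resource_put(vma->resource);
	vma->resource = NULL;
}

/* Error time, as in the patch: read whatever the vma points at now. */
static struct model_resource *model_capture_late(const struct model_vma *vma)
{
	return vma->resource;
}
```

In this model, a snapshot taken before model_async_evict() stays valid
until it is put, while model_capture_late() called afterwards sees NULL.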
> + gfp, "user");
> + mutex_unlock(&vm->vm_capture_lock);
> +
> + return capture;
> +}
> +
> static struct intel_engine_capture_vma *
> capture_user(struct intel_engine_capture_vma *capture,
> const struct i915_request *rq,
> @@ -1471,7 +1487,7 @@ capture_user(struct intel_engine_capture_vma *capture,
> capture = capture_vma_snapshot(capture, c->vma_res, gfp,
> "user");
>
> - return capture;
> + return capture_user_vm(capture, rq->context->vm, gfp);
> }
>
> static void add_vma(struct intel_engine_coredump *ee,
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index d092a86123ae..9be8aa448874 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -248,6 +248,10 @@ vma_create(struct drm_i915_gem_object *obj,
> INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
> INIT_LIST_HEAD(&vma->vm_rebind_link);
> INIT_LIST_HEAD(&vma->userptr_invalidated_link);
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> + INIT_LIST_HEAD(&vma->vm_capture_link);
> +#endif
> return vma;
>
> err_unlock:
> diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
> index 89f9854a6f69..c4fd61d51ce6 100644
> --- a/drivers/gpu/drm/i915/i915_vma_types.h
> +++ b/drivers/gpu/drm/i915/i915_vma_types.h
> @@ -310,6 +310,10 @@ struct i915_vma {
> struct list_head vm_rebind_link; /* Link in vm_rebind_list */
> /** @userptr_invalidated_link: link to the vm->userptr_invalidated_list */
> struct list_head userptr_invalidated_link;
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> + /* @vm_capture_link: link to the captureable VMA list */
> + struct list_head vm_capture_link;
> +#endif
>
> /** Timeline fence for vm_bind completion notification */
> struct {
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index b9167f950327..5fde6020e339 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -3925,12 +3925,17 @@ struct drm_i915_gem_vm_bind {
> __u64 length;
>
> /**
> - * @flags: Currently reserved, MBZ.
> + * @flags: Supported flags are:
> + *
> + * I915_GEM_VM_BIND_CAPTURE:
> + * Capture this mapping in the dump upon GPU error.
> + * CONFIG_DRM_I915_CAPTURE_ERROR should be enabled for valid capture.
> *
> * Note that @fence carries its own flags.
> */
> __u64 flags;
> -#define __I915_GEM_VM_BIND_UNKNOWN_FLAGS (~0ull)
> +#define I915_GEM_VM_BIND_CAPTURE (1ull << 0)
> +#define __I915_GEM_VM_BIND_UNKNOWN_FLAGS (-(I915_GEM_VM_BIND_CAPTURE << 1))
>
> /** @rsvd: Reserved, MBZ */
> __u64 rsvd[2];
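
As an aside on the new mask arithmetic: negating (CAPTURE << 1) as an
unsigned 64-bit value sets every bit above bit 0, so the mask covers all
flags other than CAPTURE. A small standalone sketch follows. The MODEL_*
names and model_flags_valid() are hypothetical, and the assumption that
the ioctl rejects any flag overlapping the mask is mine, since that check
is not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring the two defines in the hunk above. */
#define MODEL_VM_BIND_CAPTURE		(1ull << 0)
#define MODEL_VM_BIND_UNKNOWN_FLAGS	(-(MODEL_VM_BIND_CAPTURE << 1))

/* Returns 1 when no unknown flag bit is set, 0 otherwise. */
static int model_flags_valid(uint64_t flags)
{
	return (flags & MODEL_VM_BIND_UNKNOWN_FLAGS) == 0;
}
```

With a single known flag the mask equals ~1ull. Adding a second flag at
bit 1 would mean deriving the mask from that new highest flag instead.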