Re: [Intel-gfx] [PATCH v3 2/2] drm/i915/ttm: Failsafe migration blits

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>,
	Matthew Auld <matthew.auld@intel.com>,
	ML dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [Intel-gfx] [PATCH v3 2/2] drm/i915/ttm: Failsafe migration blits
Date: Tue, 2 Nov 2021 18:55:10 +0100	[thread overview]
Message-ID: <d9c63560-86fa-9efd-c962-eddb8cf76757@linux.intel.com> (raw)
In-Reply-To: <CAM0jSHO-ZUP39Rw=Nm=wUrNEhCzhVG_ve+3we8gWzSkoisn4yQ@mail.gmail.com>


On 11/2/21 18:40, Matthew Auld wrote:
> On Tue, 2 Nov 2021 at 16:39, Thomas Hellström
> <thomas.hellstrom@linux.intel.com> wrote:
>> If the initial fill blit or copy blit of an object fails, the old
>> content of the data might be exposed and read as soon as either CPU- or
>> GPU PTEs are set up to point at the pages.
>>
>> Intercept the blit fence with an async callback that checks the
>> blit fence for errors and if there are errors performs an async cpu blit
>> instead. If there is a failure to allocate the async dma_fence_work,
>> allocate it on the stack and sync wait for the blit to complete.
>>
>> Add selftests that simulate gpu blit failures and failure to allocate
>> the async dma_fence_work.
>>
>> A previous version of this pach used dma_fence_work, now that's
>> opencoded which adds more code but might lower the latency
>> somewhat in the common non-error case.
>>
>> v3:
>> - Style fixes (Matthew Auld)
>>
>> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 322 +++++++++++++++---
>>   drivers/gpu/drm/i915/gem/i915_gem_ttm_move.h  |   5 +
>>   .../drm/i915/gem/selftests/i915_gem_migrate.c |  24 +-
>>   3 files changed, 295 insertions(+), 56 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>> index 0ed6b7f2b95f..b89672c547f8 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>> @@ -18,6 +18,29 @@
>>   #include "gt/intel_gt.h"
>>   #include "gt/intel_migrate.h"
>>
>> +/**
>> + * DOC: Selftest failure modes for failsafe migration:
>> + *
>> + * For fail_gpu_migration, the gpu blit scheduled is always a clear blit
>> + * rather than a copy blit, and then we force the failure paths as if
>> + * the blit fence returned an error.
>> + *
>> + * For fail_work_allocation we fail the kmalloc of the async worker, we
>> + * sync the gpu blit. If it then fails, or fail_gpu_migration is set to
>> + * true, then a memcpy operation is performed sync.
>> + */
>> +#ifdef CONFIG_DRM_I915_SELFTEST
> When pushing maybe make this:
>
> #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>
> Which seems to be consistent with most of the other places.

Hmm,

I noticed that i915 is doing that, although I thought these macros were 
primarily intended for C expressions?

/Thomas


>
>> +static bool fail_gpu_migration;
>> +static bool fail_work_allocation;
>> +
>> +void i915_ttm_migrate_set_failure_modes(bool gpu_migration,
>> +                                       bool work_allocation)
>> +{
>> +       fail_gpu_migration = gpu_migration;
>> +       fail_work_allocation = work_allocation;
>> +}
>> +#endif
>> +

WARNING: multiple messages have this Message-ID (diff)

From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>,
	Matthew Auld <matthew.auld@intel.com>,
	ML dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH v3 2/2] drm/i915/ttm: Failsafe migration blits
Date: Tue, 2 Nov 2021 18:55:10 +0100	[thread overview]
Message-ID: <d9c63560-86fa-9efd-c962-eddb8cf76757@linux.intel.com> (raw)
In-Reply-To: <CAM0jSHO-ZUP39Rw=Nm=wUrNEhCzhVG_ve+3we8gWzSkoisn4yQ@mail.gmail.com>


On 11/2/21 18:40, Matthew Auld wrote:
> On Tue, 2 Nov 2021 at 16:39, Thomas Hellström
> <thomas.hellstrom@linux.intel.com> wrote:
>> If the initial fill blit or copy blit of an object fails, the old
>> content of the data might be exposed and read as soon as either CPU- or
>> GPU PTEs are set up to point at the pages.
>>
>> Intercept the blit fence with an async callback that checks the
>> blit fence for errors and if there are errors performs an async cpu blit
>> instead. If there is a failure to allocate the async dma_fence_work,
>> allocate it on the stack and sync wait for the blit to complete.
>>
>> Add selftests that simulate gpu blit failures and failure to allocate
>> the async dma_fence_work.
>>
>> A previous version of this pach used dma_fence_work, now that's
>> opencoded which adds more code but might lower the latency
>> somewhat in the common non-error case.
>>
>> v3:
>> - Style fixes (Matthew Auld)
>>
>> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 322 +++++++++++++++---
>>   drivers/gpu/drm/i915/gem/i915_gem_ttm_move.h  |   5 +
>>   .../drm/i915/gem/selftests/i915_gem_migrate.c |  24 +-
>>   3 files changed, 295 insertions(+), 56 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>> index 0ed6b7f2b95f..b89672c547f8 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>> @@ -18,6 +18,29 @@
>>   #include "gt/intel_gt.h"
>>   #include "gt/intel_migrate.h"
>>
>> +/**
>> + * DOC: Selftest failure modes for failsafe migration:
>> + *
>> + * For fail_gpu_migration, the gpu blit scheduled is always a clear blit
>> + * rather than a copy blit, and then we force the failure paths as if
>> + * the blit fence returned an error.
>> + *
>> + * For fail_work_allocation we fail the kmalloc of the async worker, we
>> + * sync the gpu blit. If it then fails, or fail_gpu_migration is set to
>> + * true, then a memcpy operation is performed sync.
>> + */
>> +#ifdef CONFIG_DRM_I915_SELFTEST
> When pushing maybe make this:
>
> #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>
> Which seems to be consistent with most of the other places.

Hmm,

I noticed that i915 is doing that, although I thought these macros were 
primarily intended for C expressions?

/Thomas


>
>> +static bool fail_gpu_migration;
>> +static bool fail_work_allocation;
>> +
>> +void i915_ttm_migrate_set_failure_modes(bool gpu_migration,
>> +                                       bool work_allocation)
>> +{
>> +       fail_gpu_migration = gpu_migration;
>> +       fail_work_allocation = work_allocation;
>> +}
>> +#endif
>> +

next prev parent reply	other threads:[~2021-11-02 17:55 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-02 16:34 [Intel-gfx] [PATCH v3 0/2] drm/i915: Failsafe migration blits Thomas Hellström
2021-11-02 16:34 ` Thomas Hellström
2021-11-02 16:34 ` [Intel-gfx] [PATCH v3 1/2] drm/i915/ttm: Reorganize the ttm move code Thomas Hellström
2021-11-02 16:34   ` Thomas Hellström
2021-11-02 16:34 ` [Intel-gfx] [PATCH v3 2/2] drm/i915/ttm: Failsafe migration blits Thomas Hellström
2021-11-02 16:34   ` Thomas Hellström
2021-11-02 17:40   ` [Intel-gfx] " Matthew Auld
2021-11-02 17:40     ` Matthew Auld
2021-11-02 17:55     ` Thomas Hellström [this message]
2021-11-02 17:55       ` Thomas Hellström
2021-11-02 18:58       ` [Intel-gfx] " Matthew Auld
2021-11-02 18:58         ` Matthew Auld
2021-11-02 18:43 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Failsafe migration blits (rev4) Patchwork
2021-11-02 19:13 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-11-02 22:07 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d9c63560-86fa-9efd-c962-eddb8cf76757@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=matthew.auld@intel.com \
    --cc=matthew.william.auld@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.