Intel-GFX Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [RFC PATCH] drm/ttm: Fix swapping dereferences of freed memory
Date: Fri, 28 May 2021 16:10:34 +0200
Message-ID: <37dacfad-b557-b210-02f0-7afa202bac51@amd.com>
In-Reply-To: <169de7a9af59135d1b63278b3b69a892ecfd4549.camel@linux.intel.com>

Am 28.05.21 um 09:33 schrieb Thomas Hellström:
> On Fri, 2021-05-28 at 09:16 +0200, Christian König wrote:
>> Am 27.05.21 um 17:51 schrieb Thomas Hellström:
>>> On Thu, 2021-05-27 at 17:32 +0200, Christian König wrote:
>>>> Am 27.05.21 um 17:05 schrieb Thomas Hellström:
>>>>> On Thu, 2021-05-27 at 17:01 +0200, Thomas Hellström wrote:
>>>>>> On Thu, 2021-05-27 at 16:54 +0200, Christian König wrote:
>>>>>>> Am 27.05.21 um 16:19 schrieb Thomas Hellström:
>>>>>>>> The swapping code was dereferencing bo->ttm pointers without having
>>>>>>>> the dma-resv lock held. Also, it might try to swap out unpopulated
>>>>>>>> bos.
>>>>>>>>
>>>>>>>> Fix this by deferring the bo->ttm dereference until we hold the
>>>>>>>> reservation lock. Check that the ttm_tt is populated after the
>>>>>>>> swap_notify callback.
>>>>>>>>
>>>>>>>> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>>>>>>> ---
>>>>>>>>      drivers/gpu/drm/ttm/ttm_bo.c     | 16 +++++++++++++++-
>>>>>>>>      drivers/gpu/drm/ttm/ttm_device.c |  8 +++-----
>>>>>>>>      2 files changed, 18 insertions(+), 6 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> index 9f53506a82fc..86213d37657b 100644
>>>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> @@ -1163,6 +1163,16 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
>>>>>>>>            if (!ttm_bo_evict_swapout_allowable(bo, ctx, &place, &locked, NULL))
>>>>>>>>                    return -EBUSY;
>>>>>>>>      
>>>>>>>> +       dma_resv_assert_held(bo->base.resv);
>>>>>>>> +
>>>>>>>> +       if (!bo->ttm ||
>>>>>>>> +           bo->ttm->page_flags & TTM_PAGE_FLAG_SG ||
>>>>>>>> +           bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) {
>>>>>>>> +               if (locked)
>>>>>>>> +                       dma_resv_unlock(bo->base.resv);
>>>>>>>> +               return -EBUSY;
>>>>>>>> +       }
>>>>>>>> +
>>>>>>>>            if (!ttm_bo_get_unless_zero(bo)) {
>>>>>>>>                    if (locked)
>>>>>>>>                            dma_resv_unlock(bo->base.resv);
>>>>>>>> @@ -1215,7 +1225,8 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
>>>>>>>>            if (bo->bdev->funcs->swap_notify)
>>>>>>>>                    bo->bdev->funcs->swap_notify(bo);
>>>>>>>>      
>>>>>>>> -       ret = ttm_tt_swapout(bo->bdev, bo->ttm, gfp_flags);
>>>>>>>> +       if (ttm_tt_is_populated(bo->ttm))
>>>>>>>> +               ret = ttm_tt_swapout(bo->bdev, bo->ttm, gfp_flags);
>>>>>>> That is exactly what I wouldn't recommend. With that, we would try
>>>>>>> to swap out the same BO over and over again.
>>>>>> But we wouldn't, since the BO is taken off the LRU and never
>>>>>> re-added,
>>>>>>
>>>>>>
>>>>> In fact, we'd probably want to take the !bo->ttm bos off the LRU as
>>>>> well..
>>>> No, we don't want to take any BOs off the LRU unless they are pinned.
>>>>
>>>> Adding a TT object or populating it doesn't necessarily put the BO
>>>> back on the LRU.
>>> OK, but swapped bos are also taken off the LRU list, so these
>>> unpopulated bos are just taking the same path. The only difference
>>> from swapped bos is that they don't get read back on re-populate, but
>>> are typically cleared.
>>>
>>> But what would be the point of keeping swapped-out bos on the LRU
>>> list, particularly when we're iterating under a spinlock?
>>> Shouldn't we try to re-add to the LRU (if not already on an LRU) just
>>> before populating? There aren't really that many calls in core TTM.
>> I want to avoid removing BOs from the LRU as much as possible, since
>> we have forgotten in multiple places to re-add them.
>>
>> Conceptually, I think the swapped BOs should have a separate memory
>> domain; this way we can ignore them cleanly when swapping things out.
> Yes, that would of course work as well. Keeping them on the system LRU
> is IMO highly undesirable.
>
>> Going to pick this patch up, modify it a bit more and then push it
>> to drm-misc-fixes for upstreaming.
> OK, I dropped the TTM fix for the purge-in-swap-notify from the i915
> series, hoping that the reworked variant of this patch lands first.

You will still need to add the second ttm_tt_is_populated() check, since I
dropped that from the version of the patch that I want to push to -fixes.

Regards,
Christian.

>
> Thanks,
> Thomas
>
>> Thanks,
>> Christian.
>>
>>> /Thomas
>>>
>>>
>>>
>>>
>>>
>>>> Christian.
>>>>
>>>>> /Thomas
>>>>>
>
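
The check ordering discussed in this thread can be sketched as a small
userspace model. This is only an illustration under stated assumptions: every
name below (model_bo, model_tt, model_swapout, model_purge, the MODEL_FLAG_*
values) is hypothetical and stands in for the kernel structures, and the real
ttm_bo_swapout() additionally handles reference counting, LRU removal and
moving the BO to system memory, none of which is modeled here. The sketch only
shows that bo->ttm is inspected after the lock is taken, and that the
populated state is re-checked after the swap_notify hook.

```c
/* Userspace model only; none of these types are the kernel's. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FLAG_SG      (1u << 0) /* stand-in for TTM_PAGE_FLAG_SG */
#define MODEL_FLAG_SWAPPED (1u << 1) /* stand-in for TTM_PAGE_FLAG_SWAPPED */

struct model_tt {
	unsigned int page_flags;
	bool populated;
};

struct model_bo {
	bool resv_locked;                         /* models the dma-resv lock */
	struct model_tt *ttm;                     /* may be NULL */
	void (*swap_notify)(struct model_bo *bo); /* optional driver hook */
	int swapout_calls;                        /* counts modeled swapouts */
};

static bool model_trylock(struct model_bo *bo)
{
	if (bo->resv_locked)
		return false;
	bo->resv_locked = true;
	return true;
}

static void model_unlock(struct model_bo *bo)
{
	bo->resv_locked = false;
}

/* Example hook that purges the pages, as a driver might do on swap_notify. */
static void model_purge(struct model_bo *bo)
{
	bo->ttm->populated = false;
}

static int model_swapout(struct model_bo *bo)
{
	if (!model_trylock(bo))
		return -EBUSY;

	/* Plays the role of dma_resv_assert_held() in the quoted patch. */
	assert(bo->resv_locked);

	/* bo->ttm is only dereferenced now that the lock is held. */
	if (!bo->ttm ||
	    (bo->ttm->page_flags & (MODEL_FLAG_SG | MODEL_FLAG_SWAPPED))) {
		model_unlock(bo);
		return -EBUSY;
	}

	if (bo->swap_notify)
		bo->swap_notify(bo);

	/* The hook may have purged the pages, so check again before swapping. */
	if (bo->ttm->populated) {
		bo->swapout_calls++;
		bo->ttm->populated = false;
		bo->ttm->page_flags |= MODEL_FLAG_SWAPPED;
	}

	model_unlock(bo);
	return 0;
}
```

In this model, a BO without a ttm, or one already flagged as swapped, is
rejected with -EBUSY and the lock is released again. A BO whose hook is
model_purge returns 0 without a swapout being attempted, which is the case the
second populated check is meant to cover.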



Thread overview: 13+ messages
2021-05-27 14:19 [Intel-gfx] [RFC PATCH] drm/ttm: Fix swapping dereferences of freed memory Thomas Hellström
2021-05-27 14:54 ` Christian König
2021-05-27 15:01   ` Thomas Hellström
2021-05-27 15:05     ` Thomas Hellström
2021-05-27 15:32       ` Christian König
2021-05-27 15:51         ` Thomas Hellström
2021-05-28  7:16           ` Christian König
2021-05-28  7:33             ` Thomas Hellström
2021-05-28 14:10               ` Christian König [this message]
2021-05-28 14:17                 ` Thomas Hellström
2021-05-28 14:21                   ` Christian König
2021-05-27 15:17     ` Christian König
2021-05-27 19:58 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork
