All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lang Yu <Lang.Yu@amd.com>
To: Matthew Auld <matthew.william.auld@gmail.com>
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	Christian Koenig <christian.koenig@amd.com>
Subject: Re: [PATCH v2] drm/ttm: fix one use-after-free
Date: Wed, 5 Jul 2023 19:20:51 +0800	[thread overview]
Message-ID: <ZKVSEw3nCsUnNUQN@lang-desktop> (raw)
In-Reply-To: <CAM0jSHN7=8Kbnm2oTnSQ4HZd0sv+nf6R5nEfyXdmHedsr-t6rw@mail.gmail.com>

On 07/05/ , Matthew Auld wrote:
> On Wed, 5 Jul 2023 at 11:08, Lang Yu <Lang.Yu@amd.com> wrote:
> >
> > bo->kref is increased once(kref_init()) in ttm_bo_release,
> > but decreased twice(ttm_bo_put()) respectively in
> > ttm_bo_delayed_delete and ttm_bo_cleanup_refs,
> > which is unbalanced.
> >
> > Just clean up bo resource in one place for a delayed deleted bo.
> >
> > Fixes: 9bff18d13473 ("drm/ttm: use per BO cleanup workers")
> >
> > [   67.399887] refcount_t: underflow; use-after-free.
> > [   67.399901] WARNING: CPU: 0 PID: 3172 at lib/refcount.c:28 refcount_warn_saturate+0xc2/0x110
> > [   67.400124] RIP: 0010:refcount_warn_saturate+0xc2/0x110
> > [   67.400173] Call Trace:
> > [   67.400176]  <TASK>
> > [   67.400181]  ttm_mem_evict_first+0x4fe/0x5b0 [ttm]
> > [   67.400216]  ttm_bo_mem_space+0x1e3/0x240 [ttm]
> > [   67.400239]  ttm_bo_validate+0xc7/0x190 [ttm]
> > [   67.400253]  ? ww_mutex_trylock+0x1b1/0x390
> > [   67.400266]  ttm_bo_init_reserved+0x183/0x1c0 [ttm]
> > [   67.400280]  ? __rwlock_init+0x3d/0x70
> > [   67.400292]  amdgpu_bo_create+0x1cd/0x4f0 [amdgpu]
> > [   67.400607]  ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu]
> > [   67.400980]  amdgpu_bo_create_user+0x38/0x70 [amdgpu]
> > [   67.401291]  amdgpu_gem_object_create+0x77/0xb0 [amdgpu]
> > [   67.401641]  ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu]
> > [   67.401958]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x228/0xa30 [amdgpu]
> > [   67.402433]  kfd_ioctl_alloc_memory_of_gpu+0x14e/0x390 [amdgpu]
> > [   67.402824]  ? lock_release+0x13f/0x290
> > [   67.402838]  kfd_ioctl+0x1e0/0x640 [amdgpu]
> > [   67.403205]  ? __pfx_kfd_ioctl_alloc_memory_of_gpu+0x10/0x10 [amdgpu]
> > [   67.403579]  ? tomoyo_file_ioctl+0x19/0x20
> > [   67.403590]  __x64_sys_ioctl+0x95/0xd0
> > [   67.403601]  do_syscall_64+0x3b/0x90
> > [   67.403609]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >
> > Signed-off-by: Lang Yu <Lang.Yu@amd.com>
> > ---
> >  drivers/gpu/drm/ttm/ttm_bo.c | 89 ++++--------------------------------
> >  1 file changed, 10 insertions(+), 79 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> > index 326a3d13a829..1e073dfb1332 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -224,82 +224,6 @@ static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
> >         dma_resv_iter_end(&cursor);
> >  }
> >
> > -/**
> > - * ttm_bo_cleanup_refs
> > - * If bo idle, remove from lru lists, and unref.
> > - * If not idle, block if possible.
> > - *
> > - * Must be called with lru_lock and reservation held, this function
> > - * will drop the lru lock and optionally the reservation lock before returning.
> > - *
> > - * @bo:                    The buffer object to clean-up
> > - * @interruptible:         Any sleeps should occur interruptibly.
> > - * @no_wait_gpu:           Never wait for gpu. Return -EBUSY instead.
> > - * @unlock_resv:           Unlock the reservation lock as well.
> > - */
> > -
> > -static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
> > -                              bool interruptible, bool no_wait_gpu,
> > -                              bool unlock_resv)
> > -{
> > -       struct dma_resv *resv = &bo->base._resv;
> > -       int ret;
> > -
> > -       if (dma_resv_test_signaled(resv, DMA_RESV_USAGE_BOOKKEEP))
> > -               ret = 0;
> > -       else
> > -               ret = -EBUSY;
> > -
> > -       if (ret && !no_wait_gpu) {
> > -               long lret;
> > -
> > -               if (unlock_resv)
> > -                       dma_resv_unlock(bo->base.resv);
> > -               spin_unlock(&bo->bdev->lru_lock);
> > -
> > -               lret = dma_resv_wait_timeout(resv, DMA_RESV_USAGE_BOOKKEEP,
> > -                                            interruptible,
> > -                                            30 * HZ);
> > -
> > -               if (lret < 0)
> > -                       return lret;
> > -               else if (lret == 0)
> > -                       return -EBUSY;
> > -
> > -               spin_lock(&bo->bdev->lru_lock);
> > -               if (unlock_resv && !dma_resv_trylock(bo->base.resv)) {
> > -                       /*
> > -                        * We raced, and lost, someone else holds the reservation now,
> > -                        * and is probably busy in ttm_bo_cleanup_memtype_use.
> > -                        *
> > -                        * Even if it's not the case, because we finished waiting any
> > -                        * delayed destruction would succeed, so just return success
> > -                        * here.
> > -                        */
> > -                       spin_unlock(&bo->bdev->lru_lock);
> > -                       return 0;
> > -               }
> > -               ret = 0;
> > -       }
> > -
> > -       if (ret) {
> > -               if (unlock_resv)
> > -                       dma_resv_unlock(bo->base.resv);
> > -               spin_unlock(&bo->bdev->lru_lock);
> > -               return ret;
> > -       }
> > -
> > -       spin_unlock(&bo->bdev->lru_lock);
> > -       ttm_bo_cleanup_memtype_use(bo);
> > -
> > -       if (unlock_resv)
> > -               dma_resv_unlock(bo->base.resv);
> > -
> > -       ttm_bo_put(bo);
> 
> The put() here is indeed broken and leads to some nasty uaf I think,
> but we fixed that a while back in: c00133a9e87e ("drm/ttm: drop extra
> ttm_bo_put in ttm_bo_cleanup_refs"). It looks like you are using an
> old tree? Perhaps the issue you are seeing was also fixed with that?

Thanks. I can see this commit in my tree but it was overrode by other
patches. It fixed this issue.

Regards,
Lang

> > -
> > -       return 0;
> > -}
> > -
> >  /*
> >   * Block for the dma_resv object to become idle, lock the buffer and clean up
> >   * the resource and tt object.
> > @@ -622,8 +546,10 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
> >         }
> >
> >         if (bo->deleted) {
> > -               ret = ttm_bo_cleanup_refs(bo, ctx->interruptible,
> > -                                         ctx->no_wait_gpu, locked);
> > +               if (locked)
> > +                       dma_resv_unlock(bo->base.resv);
> > +               spin_unlock(&bdev->lru_lock);
> > +               ret = ttm_bo_wait_ctx(bo, ctx);
> >                 ttm_bo_put(bo);
> >                 return ret;
> >         }
> > @@ -1146,7 +1072,12 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
> >         }
> >
> >         if (bo->deleted) {
> > -               ret = ttm_bo_cleanup_refs(bo, false, false, locked);
> > +               struct ttm_operation_ctx ctx = { false, false };
> > +
> > +               if (locked)
> > +                       dma_resv_unlock(bo->base.resv);
> > +               spin_unlock(&bo->bdev->lru_lock);
> > +               ret = ttm_bo_wait_ctx(bo, &ctx);
> >                 ttm_bo_put(bo);
> >                 return ret == -EBUSY ? -ENOSPC : ret;
> >         }
> > --
> > 2.25.1
> >

      reply	other threads:[~2023-07-05 11:21 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-05 10:07 [PATCH v2] drm/ttm: fix one use-after-free Lang Yu
2023-07-05 10:37 ` Matthew Auld
2023-07-05 11:20   ` Lang Yu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZKVSEw3nCsUnNUQN@lang-desktop \
    --to=lang.yu@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=matthew.william.auld@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.