Intel-XE Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>,
	Nirmoy Das <nirmoy.das@linux.intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>,
	intel-xe@lists.freedesktop.org,
	Matthew Brost <matthew.brost@intel.com>,
	Thomas Hellstrom <thomas.hellstrom@intel.com>,
	dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/xe: Fix UBSAN shift-out-of-bounds failure
Date: Tue, 7 May 2024 15:23:10 +0200
Message-ID: <24d4a9a9-c622-4f56-8672-21f4c6785476@amd.com>
In-Reply-To: <fs2aq6wgrsoilkk4spw7fbmvxnbovn555qbney6aqwetflvg75@q4zsc2l2v64s>

On 07.05.24 at 15:18, Lucas De Marchi wrote:
>
> +Thomas, +Christian, +dri-devel
>
> On Tue, May 07, 2024 at 11:42:46AM GMT, Nirmoy Das wrote:
>>
>> On 5/7/2024 11:39 AM, Nirmoy Das wrote:
>>>
>>>
>>> On 5/7/2024 10:04 AM, Shuicheng Lin wrote:
>>>> Here is the failure stack:
>>>> [   12.988209] ------------[ cut here ]------------
>>>> [   12.988216] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
>>>> [   12.988232] shift exponent 64 is too large for 64-bit type 'long unsigned int'
>>>> [   12.988235] CPU: 4 PID: 1310 Comm: gnome-shell Tainted: G     U             6.9.0-rc6+prerelease1158+ #19
>>>> [   12.988237] Hardware name: Intel Corporation Raptor Lake Client Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.3301.A02.2208050712 08/05/2022
>>>> [   12.988239] Call Trace:
>>>> [   12.988240]  <TASK>
>>>> [   12.988242]  dump_stack_lvl+0xd7/0xf0
>>>> [   12.988248]  dump_stack+0x10/0x20
>>>> [   12.988250]  ubsan_epilogue+0x9/0x40
>>>> [   12.988253] __ubsan_handle_shift_out_of_bounds+0x10e/0x170
>>>> [   12.988260]  dma_resv_reserve_fences.cold+0x2b/0x48
>>>> [   12.988262]  ? ww_mutex_lock_interruptible+0x3c/0x110
>>>> [   12.988267]  drm_exec_prepare_obj+0x45/0x60 [drm_exec]
>>>> [   12.988271]  ? vm_bind_ioctl_ops_execute+0x5b/0x740 [xe]
>>>> [   12.988345]  vm_bind_ioctl_ops_execute+0x78/0x740 [xe]
>>>>
>>>> The failure is caused by passing 0 as the num_fences parameter of
>>>> drm_exec_prepare_obj(). That value reaches __roundup_pow_of_two(),
>>>> where "0 - 1" produces the out-of-bounds shift.
>>>> num_fences should be at least 1.
>>>>
>>>> Cc: Matthew Brost<matthew.brost@intel.com>
>>>> Signed-off-by: Shuicheng Lin<shuicheng.lin@intel.com>
>>>> ---
>>>>  drivers/gpu/drm/xe/xe_vm.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>>>> index d17192c8b7de..96cb4d9762a3 100644
>>>> --- a/drivers/gpu/drm/xe/xe_vm.c
>>>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>>>> @@ -2692,7 +2692,7 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma,
>>>>      if (bo) {
>>>>          if (!bo->vm)
>>>> -            err = drm_exec_prepare_obj(exec, &bo->ttm.base, 0);
>>>> +            err = drm_exec_prepare_obj(exec, &bo->ttm.base, 1);
>>>
>>> This needs to be fixed in drm_exec_prepare_obj() by checking 
>>> num_fences and not calling dma_resv_reserve_fences()
>>>
>> or just call drm_exec_lock_obj() here. ref: 
>> https://patchwork.freedesktop.org/patch/577487/
>
> We are hit by this again. Couldn't we change drm_exec_prepare_obj() to
> check num_fences and, if it is 0, just fall back to
> drm_exec_lock_obj() as "the least amount of work needed in this case"?

No, and that reminds me (again!) that I wanted to add a WARN_ON for this.

If you don't need a fence slot in the first place then you should only 
use drm_exec_lock_obj() instead of drm_exec_prepare_obj().

If you dynamically calculate the number of fence slots needed and end up 
with zero then there is most likely something wrong with your calculation.

The behavior is intentional: we ended up with quite a few bugs around 
that.
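
The rule described above can be illustrated with a small userspace model. 
This is only a sketch: fake_lock_obj() and fake_prepare_obj() are 
hypothetical stand-ins for drm_exec_lock_obj() and drm_exec_prepare_obj(), 
struct fake_obj and warn_count are invented for the illustration, and the 
warning is an assumption about how the proposed WARN_ON might behave, not 
the kernel implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a GEM object; NOT a kernel structure. */
struct fake_obj {
	int locked;		/* 1 once the object lock is held */
	unsigned int slots;	/* fence slots reserved so far */
};

/* Counts how often the WARN_ON stand-in fired (assumption for the demo). */
static int warn_count;

/* Models drm_exec_lock_obj(): takes the lock and reserves nothing. */
static int fake_lock_obj(struct fake_obj *obj)
{
	assert(obj != NULL);
	obj->locked = 1;
	return 0;
}

/*
 * Models drm_exec_prepare_obj(): takes the lock, then reserves
 * num_fences slots. A count of zero is flagged as a caller bug
 * instead of being silently turned into a lock-only operation.
 */
static int fake_prepare_obj(struct fake_obj *obj, unsigned int num_fences)
{
	int ret = fake_lock_obj(obj);

	if (ret)
		return ret;

	if (num_fences == 0) {
		warn_count++;
		fprintf(stderr, "WARN: prepare called with num_fences == 0\n");
	}

	obj->slots += num_fences;
	return 0;
}
```

A call site that never adds a fence would use fake_lock_obj(), a call site 
that adds fences would pass a count of at least 1 to fake_prepare_obj(), 
and a computed count of zero shows up as a warning rather than being hidden.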

Regards,
Christian.

>
> Something like this:
>
> | diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c
> | index 2da094bdf8a4..68b5f6210b09 100644
> | --- a/drivers/gpu/drm/drm_exec.c
> | +++ b/drivers/gpu/drm/drm_exec.c
> | @@ -296,10 +296,12 @@ int drm_exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
> |      if (ret)
> |          return ret;
> |
> | -    ret = dma_resv_reserve_fences(obj->resv, num_fences);
> | -    if (ret) {
> | -        drm_exec_unlock_obj(exec, obj);
> | -        return ret;
> | +    if (num_fences) {
> | +        ret = dma_resv_reserve_fences(obj->resv, num_fences);
> | +        if (ret) {
> | +            drm_exec_unlock_obj(exec, obj);
> | +            return ret;
> | +        }
> |      }
> |
> |      return 0;
>
> thanks
> Lucas De Marchi
>
>>
>> Nirmoy
>>
>>>
>>> Regards,
>>>
>>> Nirmoy
>>>
>>>>          if (!err && validate)
>>>>              err = xe_bo_validate(bo, xe_vma_vm(vma), true);
>>>>      }
>>>> @@ -2777,7 +2777,7 @@ static int vm_bind_ioctl_ops_lock_and_prep(struct drm_exec *exec,
>>>>      struct xe_vma_op *op;
>>>>      int err;
>>>> -    err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 0);
>>>> +    err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 1);
>>>>      if (err)
>>>>          return err;


Thread overview: 28+ messages
2024-05-07  8:04 [PATCH] drm/xe: Fix UBSAN shift-out-of-bounds failure Shuicheng Lin
2024-05-07  9:39 ` Nirmoy Das
2024-05-07  9:42   ` Nirmoy Das
2024-05-07 13:18     ` Lucas De Marchi
2024-05-07 13:23       ` Christian König [this message]
2024-05-07 10:08 ` ✓ CI.Patch_applied: success for " Patchwork
2024-05-07 10:08 ` ✗ CI.checkpatch: warning " Patchwork
2024-05-07 10:09 ` ✓ CI.KUnit: success " Patchwork
2024-05-07 10:21 ` ✓ CI.Build: " Patchwork
2024-05-07 10:23 ` ✓ CI.Hooks: " Patchwork
2024-05-07 10:25 ` ✓ CI.checksparse: " Patchwork
2024-05-07 10:55 ` ✓ CI.BAT: " Patchwork
2024-05-07 13:04 ` [PATCH] " Shuicheng Lin
2024-05-07 15:13   ` Nirmoy Das
2024-05-09  0:25     ` Lin, Shuicheng
2024-05-09  3:39       ` Lucas De Marchi
2024-05-09  3:45         ` Lin, Shuicheng
2024-05-09  4:50           ` Lucas De Marchi
2024-05-09  3:38   ` Lucas De Marchi
2024-05-07 13:25 ` ✓ CI.Patch_applied: success for drm/xe: Fix UBSAN shift-out-of-bounds failure (rev2) Patchwork
2024-05-07 13:25 ` ✗ CI.checkpatch: warning " Patchwork
2024-05-07 13:26 ` ✓ CI.KUnit: success " Patchwork
2024-05-07 13:30 ` ✗ CI.FULL: failure for drm/xe: Fix UBSAN shift-out-of-bounds failure Patchwork
2024-05-07 13:38 ` ✓ CI.Build: success for drm/xe: Fix UBSAN shift-out-of-bounds failure (rev2) Patchwork
2024-05-07 13:41 ` ✓ CI.Hooks: " Patchwork
2024-05-07 13:43 ` ✓ CI.checksparse: " Patchwork
2024-05-07 14:17 ` ✓ CI.BAT: " Patchwork
2024-05-07 19:23 ` ✗ CI.FULL: failure " Patchwork
