All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>,
	Vivi@freedesktop.org, "Chang, Bruce" <yu.bruce.chang@intel.com>,
	intel-xe@lists.freedesktop.org
Subject: Re: [Intel-xe] [PATCH] drm/xe: fix pvc unload issue
Date: Tue, 4 Apr 2023 15:45:05 -0400	[thread overview]
Message-ID: <ZCx+Qfwr7KSaKGiw@intel.com> (raw)
In-Reply-To: <20230404184839.qlbwgv7lvvbi76i4@ldmartin-desk2.lan>

On Tue, Apr 04, 2023 at 11:48:39AM -0700, Lucas De Marchi wrote:
> On Mon, Apr 03, 2023 at 10:20:31PM +0000, Chang, Bruce wrote:
> > Currently, unload pvc driver will generate a null dereference
> > and the call stack is as below.
> > 
> > [ 4850.618000] Call Trace:
> > [ 4850.620740]  <TASK>
> > [ 4850.623134]  ttm_bo_cleanup_memtype_use+0x3f/0x50 [ttm]
> > [ 4850.628661]  ttm_bo_release+0x154/0x2c0 [ttm]
> > [ 4850.633317]  ? drm_buddy_fini+0x62/0x80 [drm_buddy]
> > [ 4850.638487]  ? __kmem_cache_free+0x27d/0x2c0
> > [ 4850.643054]  ttm_bo_put+0x38/0x60 [ttm]
> > [ 4850.647190]  xe_gem_object_free+0x1f/0x30 [xe]
> > [ 4850.651945]  drm_gem_object_free+0x1e/0x30 [drm]
> > [ 4850.656904]  ggtt_fini_noalloc+0x9d/0xe0 [xe]
> > [ 4850.661574]  drm_managed_release+0xb5/0x150 [drm]
> > [ 4850.666617]  drm_dev_release+0x30/0x50 [drm]
> > [ 4850.671209]  devm_drm_dev_init_release+0x3c/0x60 [drm]
> > 
> > There are a couple issues, but the main one is due to TTM has only
> > one TTM_PL_TT region, but since pvc has 2 tiles and tries to setup
> > 1 TTM_PL_TT each tile. The second will overwrite the first one.
> > 
> > During unload time, the first tile will reset the TTM_PL_TT manger
> > and when the second tile is trying to free Bo and it will generate
> > the null reference since the TTM manage is already got reset to 0.
> > 
> > The fix is to use one global TTM_PL_TT manager.
> > 
> > v2: make gtt mgr global and change the name to sys_mgr
> > 
> > Cc: Stuart Summers <stuart.summers@intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Vivi, Rodrigo <rodrigo.vivi@intel.com>
> > Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile                   |  2 +-
> > drivers/gpu/drm/xe/xe_device.c                |  2 +
> > drivers/gpu/drm/xe/xe_device.h                |  1 +
> > drivers/gpu/drm/xe/xe_device_types.h          |  2 +
> > drivers/gpu/drm/xe/xe_gt.c                    | 18 -----
> > drivers/gpu/drm/xe/xe_gt_types.h              |  2 -
> > drivers/gpu/drm/xe/xe_ttm_gtt_mgr.h           | 16 -----
> > drivers/gpu/drm/xe/xe_ttm_gtt_mgr_types.h     | 18 -----
> > .../xe/{xe_ttm_gtt_mgr.c => xe_ttm_sys_mgr.c} | 67 +++++++------------
> > 9 files changed, 31 insertions(+), 97 deletions(-)
> > delete mode 100644 drivers/gpu/drm/xe/xe_ttm_gtt_mgr.h
> > delete mode 100644 drivers/gpu/drm/xe/xe_ttm_gtt_mgr_types.h
> > rename drivers/gpu/drm/xe/{xe_ttm_gtt_mgr.c => xe_ttm_sys_mgr.c} (52%)
> > 
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index 669c7b6f552c..7e19a15af3dc 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -83,7 +83,7 @@ xe-y += xe_bb.o \
> > 	xe_step.o \
> > 	xe_sync.o \
> > 	xe_trace.o \
> > -	xe_ttm_gtt_mgr.o \
> > +	xe_ttm_sys_mgr.o \
> > 	xe_ttm_stolen_mgr.o \
> > 	xe_ttm_vram_mgr.o \
> > 	xe_tuning.o \
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index a79f934e3d2d..4fb9aff27686 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -279,6 +279,8 @@ int xe_device_probe(struct xe_device *xe)
> > 	if (err)
> > 		goto err_irq_shutdown;
> > 
> > +	xe_ttm_sys_mgr_init(xe);
> > +
> > 	for_each_gt(gt, xe, id) {
> > 		err = xe_gt_init_noalloc(gt);
> > 		if (err)
> > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > index d277f8985f7b..d9d1b09a8e38 100644
> > --- a/drivers/gpu/drm/xe/xe_device.h
> > +++ b/drivers/gpu/drm/xe/xe_device.h
> > @@ -116,4 +116,5 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
> > }
> > 
> > u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size);
> > +int xe_ttm_sys_mgr_init(struct xe_device *xe);
> 
>   CC [M]  drivers/gpu/drm/xe/tests/xe_bo_test.o
> ../drivers/gpu/drm/xe/xe_ttm_sys_mgr.c:99:5: error: no previous prototype for ‘xe_ttm_sys_mgr_init’ [-Werror=missing-prototypes]
>    99 | int xe_ttm_sys_mgr_init(struct xe_device *xe)
>       |     ^~~~~~~~~~~~~~~~~~~
> 

that's awkward... I didn't get issue here, not CI got it...

> 
> couple of issues:
> 
> 1) you moved the implementation to a new file and didn't include
> xe_device.h, creating the build issue above

I was wondering about gcc version, but this wouldn't be magic in older gcc...

> 2) The decl shouldn't be here though. You need a xe_ttm_sys_mgr.h, just
> like we have xe_ttm_stolen_mgr.h, xe_ttm_vram_mgr.h, xe_ttm_vram_mgr_types.h

yeap, that sounds the way to go...

> 
> Lucas De Marchi

  reply	other threads:[~2023-04-04 19:45 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-03 22:20 [Intel-xe] [PATCH] drm/xe: fix pvc unload issue Chang, Bruce
2023-04-03 22:31 ` [Intel-xe] ✓ CI.Patch_applied: success for drm/xe: fix pvc unload issue (rev2) Patchwork
2023-04-03 22:32 ` [Intel-xe] ✓ CI.KUnit: " Patchwork
2023-04-03 22:36 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-04-03 22:57 ` [Intel-xe] ○ CI.BAT: info " Patchwork
2023-04-04  3:44 ` [Intel-xe] [PATCH] drm/xe: fix pvc unload issue Matthew Brost
2023-04-04 16:29   ` Rodrigo Vivi
2023-04-04 18:48 ` Lucas De Marchi
2023-04-04 19:45   ` Rodrigo Vivi [this message]
2023-04-04 20:22     ` Lucas De Marchi
  -- strict thread matches above, loose matches on Subject: below --
2023-03-30 15:40 Chang, Bruce
2023-03-31  5:40 ` Matthew Auld
2023-03-31 22:55   ` Chang, Yu bruce

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZCx+Qfwr7KSaKGiw@intel.com \
    --to=rodrigo.vivi@intel.com \
    --cc=Vivi@freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=lucas.demarchi@intel.com \
    --cc=stuart.summers@intel.com \
    --cc=yu.bruce.chang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.