* [PATCH] drm/xe: Skip TLB invalidation time out log if ct is disabled @ 2024-02-20 2:13 Shuicheng Lin 2024-02-20 2:19 ` ✗ CI.Patch_applied: failure for " Patchwork 2024-02-20 15:05 ` [PATCH] " Matthew Brost 0 siblings, 2 replies; 6+ messages in thread From: Shuicheng Lin @ 2024-02-20 2:13 UTC (permalink / raw) To: intel-xe; +Cc: Shuicheng Lin Suspend may cause the TLB invalidation time out as below log. Skip the log print if ct is disabled to make log clean. " [ 228.812266] xe_gt_tlb_invalidation_wait enter [ 228.812311] xe_gt_suspend enter [ 228.812782] xe 0000:03:00.0: [drm] GT0: suspended [ 228.812786] xe_gt_suspend enter [ 228.813508] xe 0000:03:00.0: [drm] GT1: suspended … [ 229.067007] xe 0000:03:00.0: [drm] *ERROR* TILE0 [GTT] GT0: TLB invalidation time'd out, seqno=321, recv=319 [ 229.067099] xe 0000:03:00.0: [drm] *ERROR* GT0: CT disabled " Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> --- drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c index 7b3a54748b49..8aac12efea84 100644 --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c @@ -330,11 +330,18 @@ int xe_gt_tlb_invalidation_wait(struct xe_gt *gt, int seqno) if (!ret) { struct drm_printer p = xe_gt_err_printer(gt); - xe_tile_report_driver_error(gt_to_tile(gt), XE_TILE_DRV_ERR_GTT, - "GT%u: TLB invalidation time'd out, seqno=%d, recv=%d", - gt->info.id, seqno, gt->tlb_invalidation.seqno_recv); - xe_guc_ct_print(&guc->ct, &p, true); - return -ETIME; + /* + * guc ct may be disabled during the waiting period and lead to the timeout. + * Such as power suspend just after this tlb invalidation wait. + * Skip the error log print if ct is disabled. + */ + if (xe_guc_ct_enabled(&guc->ct)) { + xe_tile_report_driver_error(gt_to_tile(gt), XE_TILE_DRV_ERR_GTT, + "GT%u: TLB invalidation time'd out, seqno=%d, recv=%d", + gt->info.id, seqno, gt->tlb_invalidation.seqno_recv); + xe_guc_ct_print(&guc->ct, &p, true); + return -ETIME; + } } return 0; -- 2.25.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* ✗ CI.Patch_applied: failure for drm/xe: Skip TLB invalidation time out log if ct is disabled 2024-02-20 2:13 [PATCH] drm/xe: Skip TLB invalidation time out log if ct is disabled Shuicheng Lin @ 2024-02-20 2:19 ` Patchwork 2024-02-20 15:05 ` [PATCH] " Matthew Brost 1 sibling, 0 replies; 6+ messages in thread From: Patchwork @ 2024-02-20 2:19 UTC (permalink / raw) To: Shuicheng Lin; +Cc: intel-xe == Series Details == Series: drm/xe: Skip TLB invalidation time out log if ct is disabled URL : https://patchwork.freedesktop.org/series/130101/ State : failure == Summary == === Applying kernel patches on branch 'drm-tip' with base: === Base commit: faa7ecc2f307 drm-tip: 2024y-02m-19d-20h-31m-40s UTC integration manifest === git am output follows === error: patch failed: drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c:330 error: drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c: patch does not apply hint: Use 'git am --show-current-patch=diff' to see the failed patch Applying: drm/xe: Skip TLB invalidation time out log if ct is disabled Patch failed at 0001 drm/xe: Skip TLB invalidation time out log if ct is disabled When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/xe: Skip TLB invalidation time out log if ct is disabled 2024-02-20 2:13 [PATCH] drm/xe: Skip TLB invalidation time out log if ct is disabled Shuicheng Lin 2024-02-20 2:19 ` ✗ CI.Patch_applied: failure for " Patchwork @ 2024-02-20 15:05 ` Matthew Brost 2024-02-20 18:07 ` Matthew Auld 1 sibling, 1 reply; 6+ messages in thread From: Matthew Brost @ 2024-02-20 15:05 UTC (permalink / raw) To: Shuicheng Lin; +Cc: intel-xe On Tue, Feb 20, 2024 at 02:13:56AM +0000, Shuicheng Lin wrote: > Suspend may cause the TLB invalidation time out as below log. > Skip the log print if ct is disabled to make log clean. > " > [ 228.812266] xe_gt_tlb_invalidation_wait enter > [ 228.812311] xe_gt_suspend enter > [ 228.812782] xe 0000:03:00.0: [drm] GT0: suspended > [ 228.812786] xe_gt_suspend enter > [ 228.813508] xe 0000:03:00.0: [drm] GT1: suspended > … > [ 229.067007] xe 0000:03:00.0: [drm] *ERROR* TILE0 [GTT] GT0: TLB invalidation time'd out, seqno=321, recv=319 > [ 229.067099] xe 0000:03:00.0: [drm] *ERROR* GT0: CT disabled > " > This doesn't look right for a few reasons. - The timeout still can race suspend and then a resume - The xe_guc_ct_enabled check also supresses the -ETIME return - I think this message it actually valid What should probably be done is signal all pending TLB invalidations on suspend. I think we are doing a bit of rework in [1] in this area too. I'd say let's get [1] to land and if this is still an issue fixup the suspend path to signal all TLB invalidation waiters. Signaling all waiters on suspend shoud avoid having this message be printed. Matt [1] https://patchwork.freedesktop.org/series/129217/ > Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> > --- > drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 17 ++++++++++++----- > 1 file changed, 12 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > index 7b3a54748b49..8aac12efea84 100644 > --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > @@ -330,11 +330,18 @@ int xe_gt_tlb_invalidation_wait(struct xe_gt *gt, int seqno) > if (!ret) { > struct drm_printer p = xe_gt_err_printer(gt); > > - xe_tile_report_driver_error(gt_to_tile(gt), XE_TILE_DRV_ERR_GTT, > - "GT%u: TLB invalidation time'd out, seqno=%d, recv=%d", > - gt->info.id, seqno, gt->tlb_invalidation.seqno_recv); > - xe_guc_ct_print(&guc->ct, &p, true); > - return -ETIME; > + /* > + * guc ct may be disabled during the waiting period and lead to the timeout. > + * Such as power suspend just after this tlb invalidation wait. > + * Skip the error log print if ct is disabled. > + */ > + if (xe_guc_ct_enabled(&guc->ct)) { > + xe_tile_report_driver_error(gt_to_tile(gt), XE_TILE_DRV_ERR_GTT, > + "GT%u: TLB invalidation time'd out, seqno=%d, recv=%d", > + gt->info.id, seqno, gt->tlb_invalidation.seqno_recv); > + xe_guc_ct_print(&guc->ct, &p, true); > + return -ETIME; > + } > } > > return 0; > -- > 2.25.1 > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/xe: Skip TLB invalidation time out log if ct is disabled 2024-02-20 15:05 ` [PATCH] " Matthew Brost @ 2024-02-20 18:07 ` Matthew Auld 2024-02-20 18:35 ` Matthew Brost 0 siblings, 1 reply; 6+ messages in thread From: Matthew Auld @ 2024-02-20 18:07 UTC (permalink / raw) To: Matthew Brost, Shuicheng Lin; +Cc: intel-xe On 20/02/2024 15:05, Matthew Brost wrote: > On Tue, Feb 20, 2024 at 02:13:56AM +0000, Shuicheng Lin wrote: >> Suspend may cause the TLB invalidation time out as below log. >> Skip the log print if ct is disabled to make log clean. >> " >> [ 228.812266] xe_gt_tlb_invalidation_wait enter >> [ 228.812311] xe_gt_suspend enter >> [ 228.812782] xe 0000:03:00.0: [drm] GT0: suspended >> [ 228.812786] xe_gt_suspend enter >> [ 228.813508] xe 0000:03:00.0: [drm] GT1: suspended >> … >> [ 229.067007] xe 0000:03:00.0: [drm] *ERROR* TILE0 [GTT] GT0: TLB invalidation time'd out, seqno=321, recv=319 >> [ 229.067099] xe 0000:03:00.0: [drm] *ERROR* GT0: CT disabled >> " >> > > This doesn't look right for a few reasons. > - The timeout still can race suspend and then a resume > - The xe_guc_ct_enabled check also supresses the -ETIME return > - I think this message it actually valid > > What should probably be done is signal all pending TLB invalidations on > suspend. I think we are doing a bit of rework in [1] in this area too. > I'd say let's get [1] to land and if this is still an issue fixup the > suspend path to signal all TLB invalidation waiters. Signaling all > waiters on suspend shoud avoid having this message be printed. I think [1] will only help with rpm, also currently all callers of xe_gt_tlb_invalidation_wait() will always have an rpm ref anyway, AFAICT. There is the forced suspend path which is quite a different beast though, so likely it is there where we need to be more solid? > > Matt > > [1] https://patchwork.freedesktop.org/series/129217/ > >> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> >> --- >> drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 17 ++++++++++++----- >> 1 file changed, 12 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c >> index 7b3a54748b49..8aac12efea84 100644 >> --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c >> +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c >> @@ -330,11 +330,18 @@ int xe_gt_tlb_invalidation_wait(struct xe_gt *gt, int seqno) >> if (!ret) { >> struct drm_printer p = xe_gt_err_printer(gt); >> >> - xe_tile_report_driver_error(gt_to_tile(gt), XE_TILE_DRV_ERR_GTT, >> - "GT%u: TLB invalidation time'd out, seqno=%d, recv=%d", >> - gt->info.id, seqno, gt->tlb_invalidation.seqno_recv); >> - xe_guc_ct_print(&guc->ct, &p, true); >> - return -ETIME; >> + /* >> + * guc ct may be disabled during the waiting period and lead to the timeout. >> + * Such as power suspend just after this tlb invalidation wait. >> + * Skip the error log print if ct is disabled. >> + */ >> + if (xe_guc_ct_enabled(&guc->ct)) { >> + xe_tile_report_driver_error(gt_to_tile(gt), XE_TILE_DRV_ERR_GTT, >> + "GT%u: TLB invalidation time'd out, seqno=%d, recv=%d", >> + gt->info.id, seqno, gt->tlb_invalidation.seqno_recv); >> + xe_guc_ct_print(&guc->ct, &p, true); >> + return -ETIME; >> + } >> } >> >> return 0; >> -- >> 2.25.1 >> ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/xe: Skip TLB invalidation time out log if ct is disabled 2024-02-20 18:07 ` Matthew Auld @ 2024-02-20 18:35 ` Matthew Brost 2024-02-21 0:43 ` Lin, Shuicheng 0 siblings, 1 reply; 6+ messages in thread From: Matthew Brost @ 2024-02-20 18:35 UTC (permalink / raw) To: Matthew Auld; +Cc: Shuicheng Lin, intel-xe On Tue, Feb 20, 2024 at 06:07:00PM +0000, Matthew Auld wrote: > On 20/02/2024 15:05, Matthew Brost wrote: > > On Tue, Feb 20, 2024 at 02:13:56AM +0000, Shuicheng Lin wrote: > > > Suspend may cause the TLB invalidation time out as below log. > > > Skip the log print if ct is disabled to make log clean. > > > " > > > [ 228.812266] xe_gt_tlb_invalidation_wait enter > > > [ 228.812311] xe_gt_suspend enter > > > [ 228.812782] xe 0000:03:00.0: [drm] GT0: suspended > > > [ 228.812786] xe_gt_suspend enter > > > [ 228.813508] xe 0000:03:00.0: [drm] GT1: suspended > > > … > > > [ 229.067007] xe 0000:03:00.0: [drm] *ERROR* TILE0 [GTT] GT0: TLB invalidation time'd out, seqno=321, recv=319 > > > [ 229.067099] xe 0000:03:00.0: [drm] *ERROR* GT0: CT disabled > > > " > > > > > > > This doesn't look right for a few reasons. > > - The timeout still can race suspend and then a resume > > - The xe_guc_ct_enabled check also supresses the -ETIME return > > - I think this message it actually valid > > > > What should probably be done is signal all pending TLB invalidations on > > suspend. I think we are doing a bit of rework in [1] in this area too. > > I'd say let's get [1] to land and if this is still an issue fixup the > > suspend path to signal all TLB invalidation waiters. Signaling all > > waiters on suspend shoud avoid having this message be printed. > > I think [1] will only help with rpm, also currently all callers of > xe_gt_tlb_invalidation_wait() will always have an rpm ref anyway, AFAICT. > There is the forced suspend path which is quite a different beast though, so > likely it is there where we need to be more solid? > Yes it likely not the rpm path here but still think it worth getting [1] in and perhaps a version [2] in first though. Matt [1] https://patchwork.freedesktop.org/series/129217/ [2] https://patchwork.freedesktop.org/series/122772/ > > > > Matt > > > > [1] https://patchwork.freedesktop.org/series/129217/ > > > > > Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> > > > --- > > > drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 17 ++++++++++++----- > > > 1 file changed, 12 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > > > index 7b3a54748b49..8aac12efea84 100644 > > > --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > > > +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > > > @@ -330,11 +330,18 @@ int xe_gt_tlb_invalidation_wait(struct xe_gt *gt, int seqno) > > > if (!ret) { > > > struct drm_printer p = xe_gt_err_printer(gt); > > > - xe_tile_report_driver_error(gt_to_tile(gt), XE_TILE_DRV_ERR_GTT, > > > - "GT%u: TLB invalidation time'd out, seqno=%d, recv=%d", > > > - gt->info.id, seqno, gt->tlb_invalidation.seqno_recv); > > > - xe_guc_ct_print(&guc->ct, &p, true); > > > - return -ETIME; > > > + /* > > > + * guc ct may be disabled during the waiting period and lead to the timeout. > > > + * Such as power suspend just after this tlb invalidation wait. > > > + * Skip the error log print if ct is disabled. > > > + */ > > > + if (xe_guc_ct_enabled(&guc->ct)) { > > > + xe_tile_report_driver_error(gt_to_tile(gt), XE_TILE_DRV_ERR_GTT, > > > + "GT%u: TLB invalidation time'd out, seqno=%d, recv=%d", > > > + gt->info.id, seqno, gt->tlb_invalidation.seqno_recv); > > > + xe_guc_ct_print(&guc->ct, &p, true); > > > + return -ETIME; > > > + } > > > } > > > return 0; > > > -- > > > 2.25.1 > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [PATCH] drm/xe: Skip TLB invalidation time out log if ct is disabled 2024-02-20 18:35 ` Matthew Brost @ 2024-02-21 0:43 ` Lin, Shuicheng 0 siblings, 0 replies; 6+ messages in thread From: Lin, Shuicheng @ 2024-02-21 0:43 UTC (permalink / raw) To: Brost, Matthew, Auld, Matthew; +Cc: intel-xe@lists.freedesktop.org Thanks all for the review. Will check it again after the 2 series are merged. Best Regards Shuicheng > -----Original Message----- > From: Brost, Matthew <matthew.brost@intel.com> > Sent: Wednesday, February 21, 2024 2:35 AM > To: Auld, Matthew <matthew.auld@intel.com> > Cc: Lin, Shuicheng <shuicheng.lin@intel.com>; intel-xe@lists.freedesktop.org > Subject: Re: [PATCH] drm/xe: Skip TLB invalidation time out log if ct is disabled > > On Tue, Feb 20, 2024 at 06:07:00PM +0000, Matthew Auld wrote: > > On 20/02/2024 15:05, Matthew Brost wrote: > > > On Tue, Feb 20, 2024 at 02:13:56AM +0000, Shuicheng Lin wrote: > > > > Suspend may cause the TLB invalidation time out as below log. > > > > Skip the log print if ct is disabled to make log clean. > > > > " > > > > [ 228.812266] xe_gt_tlb_invalidation_wait enter [ 228.812311] > > > > xe_gt_suspend enter [ 228.812782] xe 0000:03:00.0: [drm] GT0: > > > > suspended [ 228.812786] xe_gt_suspend enter [ 228.813508] xe > > > > 0000:03:00.0: [drm] GT1: suspended … [ 229.067007] xe > > > > 0000:03:00.0: [drm] *ERROR* TILE0 [GTT] GT0: TLB invalidation > > > > time'd out, seqno=321, recv=319 [ 229.067099] xe 0000:03:00.0: > > > > [drm] *ERROR* GT0: CT disabled " > > > > > > > > > > This doesn't look right for a few reasons. > > > - The timeout still can race suspend and then a resume > > > - The xe_guc_ct_enabled check also supresses the -ETIME return > > > - I think this message it actually valid > > > > > > What should probably be done is signal all pending TLB invalidations > > > on suspend. I think we are doing a bit of rework in [1] in this area too. > > > I'd say let's get [1] to land and if this is still an issue fixup > > > the suspend path to signal all TLB invalidation waiters. Signaling > > > all waiters on suspend shoud avoid having this message be printed. > > > > I think [1] will only help with rpm, also currently all callers of > > xe_gt_tlb_invalidation_wait() will always have an rpm ref anyway, AFAICT. > > There is the forced suspend path which is quite a different beast > > though, so likely it is there where we need to be more solid? > > > > Yes it likely not the rpm path here but still think it worth getting [1] in and > perhaps a version [2] in first though. > > Matt > > [1] https://patchwork.freedesktop.org/series/129217/ > [2] https://patchwork.freedesktop.org/series/122772/ > > > > > > > Matt > > > > > > [1] https://patchwork.freedesktop.org/series/129217/ > > > > > > > Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> > > > > --- > > > > drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 17 ++++++++++++----- > > > > 1 file changed, 12 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > > > > b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > > > > index 7b3a54748b49..8aac12efea84 100644 > > > > --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > > > > +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c > > > > @@ -330,11 +330,18 @@ int xe_gt_tlb_invalidation_wait(struct xe_gt > *gt, int seqno) > > > > if (!ret) { > > > > struct drm_printer p = xe_gt_err_printer(gt); > > > > - xe_tile_report_driver_error(gt_to_tile(gt), > XE_TILE_DRV_ERR_GTT, > > > > - "GT%u: TLB invalidation time'd out, > seqno=%d, recv=%d", > > > > - gt->info.id, seqno, gt- > >tlb_invalidation.seqno_recv); > > > > - xe_guc_ct_print(&guc->ct, &p, true); > > > > - return -ETIME; > > > > + /* > > > > + * guc ct may be disabled during the waiting period and lead to > the timeout. > > > > + * Such as power suspend just after this tlb invalidation wait. > > > > + * Skip the error log print if ct is disabled. > > > > + */ > > > > + if (xe_guc_ct_enabled(&guc->ct)) { > > > > + xe_tile_report_driver_error(gt_to_tile(gt), > XE_TILE_DRV_ERR_GTT, > > > > + "GT%u: TLB invalidation > time'd out, seqno=%d, recv=%d", > > > > + gt->info.id, seqno, gt- > >tlb_invalidation.seqno_recv); > > > > + xe_guc_ct_print(&guc->ct, &p, true); > > > > + return -ETIME; > > > > + } > > > > } > > > > return 0; > > > > -- > > > > 2.25.1 > > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-02-21 0:43 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-02-20 2:13 [PATCH] drm/xe: Skip TLB invalidation time out log if ct is disabled Shuicheng Lin 2024-02-20 2:19 ` ✗ CI.Patch_applied: failure for " Patchwork 2024-02-20 15:05 ` [PATCH] " Matthew Brost 2024-02-20 18:07 ` Matthew Auld 2024-02-20 18:35 ` Matthew Brost 2024-02-21 0:43 ` Lin, Shuicheng
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox