From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0C3DAE77180 for ; Thu, 12 Dec 2024 17:31:00 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C6BCD10E3E3; Thu, 12 Dec 2024 17:30:59 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="kV60Rxta"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) by gabe.freedesktop.org (Postfix) with ESMTPS id 72B8D10E3E3 for ; Thu, 12 Dec 2024 17:30:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1734024659; x=1765560659; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to; bh=qBJysWKN8M9v51By+v1kETYL841cU8jMY2vQhTcclHI=; b=kV60RxtaiFyva5Y0KePqo0KypMp1VF5MarzinJUH7+1x4CrK2iP9yKOO ePY6X1gcMitc0bEJYSAlvqGencvhaHIGp93OzcG19H9ayBHp32UtBemiM QwBmgID158fnO1OAF9/vopQDKYVJU+v/QoH28XhwmLKFUmzK94cGfTvlI TYvDuAdyIP1fkv3Vg7Rhtbs7hoVl/kDmX6S9maJCnI16GY0KfLChuYGE3 zb/ly3ObRl8NKbIlLdaqT2M3nRJnJ962VNtMdogEGh3F3E4jmDq1MjDs4 N+Iye5NZlCff5mlgXtQn2+XBAzByxTbG9IkWB2Un5e875QUMrVcW7KrH2 A==; X-CSE-ConnectionGUID: jKVQggChR2WANvtQn9BY9Q== X-CSE-MsgGUID: eU2J7NRURu2nxPJvYqABrQ== X-IronPort-AV: E=McAfee;i="6700,10204,11284"; a="38242635" X-IronPort-AV: E=Sophos;i="6.12,229,1728975600"; d="scan'208,217";a="38242635" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Dec 2024 09:30:58 -0800 X-CSE-ConnectionGUID: TrmvimjgQBePMuW1vUXZ9Q== X-CSE-MsgGUID: Fd7c5LCRRxuoxz0CA1sRjg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,229,1728975600"; d="scan'208,217";a="96700667" Received: from uweinrib-mobl1.ger.corp.intel.com (HELO [10.245.149.72]) ([10.245.149.72]) by fmviesa010-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Dec 2024 09:30:55 -0800 Content-Type: multipart/alternative; boundary="------------dDwiGds8G8ytj73E9V4DsaFk" Message-ID: Date: Thu, 12 Dec 2024 18:30:51 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v2] drm/xe/xe_sync: Add debug printing to check_ufence To: "Cavitt, Jonathan" , "intel-xe@lists.freedesktop.org" Cc: "Gupta, saurabhg" , "Zuo, Alex" , "Vivi, Rodrigo" References: <20241206181155.87070-1-jonathan.cavitt@intel.com> <103d96a4-753b-4dd1-9e43-e75b12ef10bf@linux.intel.com> Content-Language: en-US From: Nirmoy Das In-Reply-To: X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" This is a multi-part message in MIME format. --------------dDwiGds8G8ytj73E9V4DsaFk Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On 12/11/2024 4:24 PM, Cavitt, Jonathan wrote: > >   > >   > > *From:*Nirmoy Das > *Sent:* Friday, December 6, 2024 11:28 PM > *To:* Cavitt, Jonathan ; intel-xe@lists.freedesktop.org > *Cc:* Gupta, saurabhg ; Zuo, Alex ; Vivi, Rodrigo > *Subject:* Re: [PATCH v2] drm/xe/xe_sync: Add debug printing to check_ufence > >   > >   > > On 12/6/2024 7:11 PM, Jonathan Cavitt wrote: > > The xe_sync helper function check_ufence can occasionally report EBUSY > > if the ufence has not been signalled yet.  EBUSY is a non-fatal error > > value for the function, so it is not desireable to warn in cases where > > EBUSY is reported because it is up to the user to decide if EBUSY is a > > fatal error in their use cases.  However, we can and should report EBUSY > > to the debug logs for diagnostic purposes. > >   > > v2: Use vm_dbg instead of XE_IOCTL_DBG (Rodrigo) > >   > > Signed-off-by: Jonathan Cavitt > > CC: Rodrigo Vivi > > --- > > drivers/gpu/drm/xe/xe_vm.c | 10 ++++++++-- > > 1 file changed, 8 insertions(+), 2 deletions(-) > >   > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c > > index 74d684708b00..8c770d1b916c 100644 > > --- a/drivers/gpu/drm/xe/xe_vm.c > > +++ b/drivers/gpu/drm/xe/xe_vm.c > > @@ -2402,8 +2402,11 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm, > >           break; > >   case DRM_GPUVA_OP_REMAP: > >           err = check_ufence(gpuva_to_vma(op->base.remap.unmap->va)); > > -          if (err) > > +          if (err) { > > +                  vm_dbg(&vm->xe->drm, > > +                         "REMAP: vma check ufence status = %i\n", err); > > Move that to check_ufence() instead so there there is only one copy of logging ? > > IMO I think there’s value in knowing which operation is failing, and that information would > > be lost if we moved the logging into check_ufence. > igt stack strace[1] dumps the operation info and any UMD should be able know the failed operation but not against as it is now: Reviewed-by: Nirmoy Das [1] https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_12052/shard-bmg-8/igt@xe_exec_compute_mode@many-rebind.html > -Jonathan Cavitt > > Regards, > > Nirmoy > > > >   > >                   break; > > +          } > > > >            err = vma_lock_and_validate(exec, > >                                      gpuva_to_vma(op->base.remap.unmap->va), > > @@ -2415,8 +2418,11 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm, > >           break; > >   case DRM_GPUVA_OP_UNMAP: > >           err = check_ufence(gpuva_to_vma(op->base.unmap.va)); > > -          if (err) > > +          if (err) { > > +                  vm_dbg(&vm->xe->drm, > > +                         "UNMAP: vma check ufence status = %i\n", err); > >                   break; > > +          } > > > >            err = vma_lock_and_validate(exec, > >                                      gpuva_to_vma(op->base.unmap.va), > --------------dDwiGds8G8ytj73E9V4DsaFk Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit


On 12/11/2024 4:24 PM, Cavitt, Jonathan wrote:

 

 

From: Nirmoy Das <nirmoy.das@linux.intel.com>
Sent: Friday, December 6, 2024 11:28 PM
To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; intel-xe@lists.freedesktop.org
Cc: Gupta, saurabhg <saurabhg.gupta@intel.com>; Zuo, Alex <alex.zuo@intel.com>; Vivi, Rodrigo <rodrigo.vivi@intel.com>
Subject: Re: [PATCH v2] drm/xe/xe_sync: Add debug printing to check_ufence

 

 

On 12/6/2024 7:11 PM, Jonathan Cavitt wrote:

The xe_sync helper function check_ufence can occasionally report EBUSY
if the ufence has not been signalled yet.  EBUSY is a non-fatal error
value for the function, so it is not desireable to warn in cases where
EBUSY is reported because it is up to the user to decide if EBUSY is a
fatal error in their use cases.  However, we can and should report EBUSY
to the debug logs for diagnostic purposes.
 
v2: Use vm_dbg instead of XE_IOCTL_DBG (Rodrigo)
 
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
CC: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
 
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 74d684708b00..8c770d1b916c 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -2402,8 +2402,11 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm,
           break;
   case DRM_GPUVA_OP_REMAP:
           err = check_ufence(gpuva_to_vma(op->base.remap.unmap->va));
-          if (err)
+          if (err) {
+                  vm_dbg(&vm->xe->drm,
+                         "REMAP: vma check ufence status = %i\n", err);

Move that to check_ufence() instead so there there is only one copy of logging ?

IMO I think there’s value in knowing which operation is failing, and that information would

be lost if we moved the logging into check_ufence.

igt stack strace[1] dumps the operation info and any UMD should be able know the failed operation but not against as it is now:

Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>


[1] https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_12052/shard-bmg-8/igt@xe_exec_compute_mode@many-rebind.html

-Jonathan Cavitt

Regards,

Nirmoy



 
                   break;
+          }
 
           err = vma_lock_and_validate(exec,
                                      gpuva_to_vma(op->base.remap.unmap->va),
@@ -2415,8 +2418,11 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm,
           break;
   case DRM_GPUVA_OP_UNMAP:
           err = check_ufence(gpuva_to_vma(op->base.unmap.va));
-          if (err)
+          if (err) {
+                  vm_dbg(&vm->xe->drm,
+                         "UNMAP: vma check ufence status = %i\n", err);
                   break;
+          }
 
           err = vma_lock_and_validate(exec,
                                      gpuva_to_vma(op->base.unmap.va),
--------------dDwiGds8G8ytj73E9V4DsaFk--