Date: Wed, 26 Jul 2023 14:33:39 -0400
From: Rodrigo Vivi
To: "Ghimiray, Himal Prasad"
References: <20230725155115.3759312-1-himal.prasad.ghimiray@intel.com> <20230725155115.3759312-3-himal.prasad.ghimiray@intel.com>
Subject: Re: [Intel-xe] [PATCH v8 2/3] drm/xe: Notify Userspace when gt reset fails
List-Id: Intel Xe graphics driver
Cc: intel-xe@lists.freedesktop.org

On Wed, Jul 26, 2023 at 10:44:12PM +0530, Ghimiray, Himal Prasad wrote:
> Hi Rodrigo,
>
> On 26-07-2023 20:09, Rodrigo Vivi wrote:
> > On Tue, Jul 25, 2023 at 09:21:14PM +0530, Himal Prasad Ghimiray wrote:
> > > Send uevent in case of gt reset failure. This intimation can be used by
> > > userspace monitoring tool to do the device level reset/reboot
> > > when GT reset fails. udevadm can be used to monitor the uevents.
> > >
> > > v2:
> > > - Support only gt failure notification (Rodrigo)
> > >
> > > v3
> > > - Rectify the comments in header file.
> > >
> > > v4
> > > - Use pci kobj instead of drm kobj for notification.(Rodrigo)
> > > - Cleanup (Badal)
> > >
> > > Cc: Aravind Iddamsetty
> > > Cc: Tejas Upadhyay
> > > Cc: Rodrigo Vivi
> > > Reviewed-by: Badal Nilawar
> > > Signed-off-by: Himal Prasad Ghimiray
> > Cc: Matt Roper Matt Roper
> >
> > > ---
> > >  drivers/gpu/drm/xe/xe_gt.c | 17 +++++++++++++++++
> > >  include/uapi/drm/xe_drm.h  |  8 ++++++++
> > >  2 files changed, 25 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > > index 3e32d38aeeea..f4766fb6bfdb 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > > @@ -8,6 +8,7 @@
> > >  #include
> > >  #include
> > > +#include
> > >  #include "regs/xe_gt_regs.h"
> > >  #include "xe_bb.h"
> > > @@ -500,6 +501,19 @@ static int do_gt_restart(struct xe_gt *gt)
> > >  	return 0;
> > >  }
> > > +static void xe_uevent_gt_reset_failure(struct pci_dev *pdev, u8 id)
> > > +{
> > > +	char *reset_event[4];
> > > +
> > > +	reset_event[0] = XE_RESET_FAILED_UEVENT "=NEEDS_RESET";
> > > +	reset_event[1] = "RESET_FAILED=gt";
> > > +	reset_event[2] = kasprintf(GFP_KERNEL, "RESET_ID=%d", id);
> > should we also put which tile this is coming from?
> > Matt?
> >
> > > +	reset_event[3] = NULL;
> > > +	kobject_uevent_env(&pdev->dev.kobj, KOBJ_CHANGE, reset_event);
> > Himal, could you please paste here an example of the output of this event
> > when monitoring it with the:
> > $ udevadm monitor
> > ?
>
> Please find the output from udevadm monitor below

this is really great. Thank you. (more below)

>
>
> KERNEL[471.352287] change
> /devices/pci0000:89/0000:89:02.0/0000:8a:00.0/0000:8b:01.0/0000:8c:00.0
> (pci)
> ACTION=change
> DEVPATH=/devices/pci0000:89/0000:89:02.0/0000:8a:00.0/0000:8b:01.0/0000:8c:00.0

Since it is at the PCI level, we need to identify the tile. Maybe: TILE=%id?
reset_event[x] = kasprintf(GFP_KERNEL, "TILE_ID=%d", id);

> SUBSYSTEM=pci
> DEVICE_STATUS=NEEDS_RESET
> RESET_FAILED=gt

What could be the other RESET_FAILED options?
Could we get some code documentation along with this?

> RESET_ID=0

maybe s/RESET_ID/GT_ID ?
reset_event[x] = kasprintf(GFP_KERNEL, "GT_ID=%d", id);

> DRIVER=xe
> PCI_CLASS=38000
> PCI_ID=8086:0BD6
> PCI_SUBSYS_ID=8086:0000
> PCI_SLOT_NAME=0000:8c:00.0
> MODALIAS=pci:v00008086d00000BD6sv00008086sd00000000bc03sc80i00
> SEQNUM=8817
>
> BR
>
> Himal
>
> > >
> > > +
> > > +	kfree(reset_event[2]);
> > > +}
> > > +
> > >  static int gt_reset(struct xe_gt *gt)
> > >  {
> > >  	int err;
> > > @@ -550,6 +564,9 @@ static int gt_reset(struct xe_gt *gt)
> > >  	xe_device_mem_access_put(gt_to_xe(gt));
> > >  	xe_gt_err(gt, "reset failed (%pe)\n", ERR_PTR(err));
> > > +	/* Notify userspace about gt reset failure */
> > > +	xe_uevent_gt_reset_failure(to_pci_dev(gt_to_xe(gt)->drm.dev), gt->info.id);
> > > +
> > >  	return err;
> > >  }
> > > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> > > index 347351a8f618..fdacee0a27c5 100644
> > > --- a/include/uapi/drm/xe_drm.h
> > > +++ b/include/uapi/drm/xe_drm.h
> > > @@ -16,6 +16,14 @@ extern "C" {
> > >   * subject to backwards-compatibility constraints.
> > >   */
> > > +/*
> > > + * Uevent generated by xe on it's pci node.
> > > + *
> > > + * XE_RESET_FAILED_UEVENT - Event is generated when attempt to reset engine
> > > + * fails. The value supplied with the event is always "NEEDS_RESET".
> > > + */
> > > +#define XE_RESET_FAILED_UEVENT "DEVICE_STATUS"
> > > +
> > >  /**
> > >   * struct xe_user_extension - Base class for defining a chain of extensions
> > >   *
> > > --
> > > 2.25.1
> > >
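
To make the naming question concrete, here is a rough userspace sketch of how
a monitoring tool might consume these keys. It is only an illustration under
assumptions: parse_reset_uevent(), match_key() and struct reset_failure are
hypothetical names that do not exist in the patch, and GT_ID / TILE_ID are the
names proposed in the comments above, not keys the driver emits today (v8
emits RESET_ID). The sketch accepts either spelling.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace view of the event; not part of the patch. */
struct reset_failure {
	int needs_reset;	/* 1 if DEVICE_STATUS=NEEDS_RESET was seen */
	int gt_id;		/* GT_ID (or RESET_ID as in v8), -1 if absent */
	int tile_id;		/* TILE_ID (proposed only), -1 if absent */
};

/* Return the value part of "KEY=VALUE" if line starts with "key=", else NULL. */
static const char *match_key(const char *line, const char *key)
{
	size_t n = strlen(key);

	if (strncmp(line, key, n) == 0 && line[n] == '=')
		return line + n + 1;
	return NULL;
}

/* Walk a NULL-terminated array of "KEY=VALUE" strings from one uevent. */
void parse_reset_uevent(const char *const *env, struct reset_failure *out)
{
	const char *v;

	assert(env != NULL && out != NULL);

	out->needs_reset = 0;
	out->gt_id = -1;
	out->tile_id = -1;

	for (; *env; env++) {
		if ((v = match_key(*env, "DEVICE_STATUS")))
			out->needs_reset = (strcmp(v, "NEEDS_RESET") == 0);
		else if ((v = match_key(*env, "GT_ID")) ||
			 (v = match_key(*env, "RESET_ID")))
			out->gt_id = atoi(v);
		else if ((v = match_key(*env, "TILE_ID")))
			out->tile_id = atoi(v);
	}
}
```

A tool would feed it the KEY=VALUE strings received for the "change" event on
the pci device. How those strings are obtained (libudev, a netlink socket,
parsing udevadm output) is left out on purpose. With only the PCI DEVPATH to go
on, a consumer like this cannot tell tiles apart unless the event carries
something like TILE_ID, which is the reason for asking.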