From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9FFE3CCF9FE for ; Mon, 3 Nov 2025 08:59:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 21D9710E393; Mon, 3 Nov 2025 08:59:30 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (1024-bit key; unprotected) header.d=amd.com header.i=@amd.com header.b="wg3G9Hei"; dkim-atps=neutral Received: from BL2PR02CU003.outbound.protection.outlook.com (mail-eastusazon11011051.outbound.protection.outlook.com [52.101.52.51]) by gabe.freedesktop.org (Postfix) with ESMTPS id C9CC910E399 for ; Mon, 3 Nov 2025 08:59:28 +0000 (UTC) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=n6jenfcg6Oze6+0LgZNDZfNqAGn69OWfpsIhdz/LEacqOmNsq/VVVFPN1X83AqNoYcpSAMFumskmg0U7NPxn4FFhccwLSAoKZioXfkspyzrVMFSE0wuEsQAE67neVRLwE41YptQkLybeTTtuCavDzv9bCKcQ+AVSOoHwg+9Nod+chUTB+3b0JJ4v20XhKWDDNXQvLkZ6RYWUISTIC/+Q7u2o69O1cnXIH1bIKvzNPmw9hRnGxpIBsWU1cd3IVFN31qZRreDLJ5XgzglCDa7pZacAwF702K9KMAT/h95q+LYFZqoiLgDDjoKm+3Snmcjd8zvY+F974NiIwRhg6MDbaw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=4xCc/XgVKyXd3SV3epGCvcP0e+oxLp5SLbZF0ibcVZI=; b=L26HAnq3tdSvl4JYXcQjEBQlvPMUgkbnUvcxd/COXnQWbCrloB31jTenPdH4JXFijvp3H73k8EJ7uEVR5rZGkOZOeM35dJaVJh6/adQdzyS3GWMsbqHlqmOsBv4vOgeQ8fqzB9n+7fqJcUA7Q1C7TBuJt9o2l5zpnq9SgjqOG4yeJPe3L3hGj3axI75xB6XbPQsN/+Hm8vptxrn/EtZ1onIUINiezl8T02W0trmtHDvB4KkNbQRtGWb1e7qEdLcLkDhdUXDQBGyPvLIMT6o7Cn2VXwrzMWnKp8JbezegYZtR5wzkwfeBXbsvNJ6dYZdHDhPFkoT32yXpWn6uN2YMiw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=4xCc/XgVKyXd3SV3epGCvcP0e+oxLp5SLbZF0ibcVZI=; b=wg3G9HeiX0lQFEjBtsfJI75fm1WV5+JQqwYbwZiJfbp7TNDGZNPsWLkNZ2g8RQVsDwS4hpxGeTq8l8n0HNY9ofjDVyKjfC5H05I5TdtobvuifzY4B+WmgNWEncMt3aMHNZI+uQ00lmEBpa8HOLBKWyG5jbrMRBxT3yqhvwJBAnU= Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com; Received: from DS0PR12MB7804.namprd12.prod.outlook.com (2603:10b6:8:142::5) by CYYPR12MB8655.namprd12.prod.outlook.com (2603:10b6:930:c4::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9275.15; Mon, 3 Nov 2025 08:59:25 +0000 Received: from DS0PR12MB7804.namprd12.prod.outlook.com ([fe80::e71b:4ec9:237f:24ea]) by DS0PR12MB7804.namprd12.prod.outlook.com ([fe80::e71b:4ec9:237f:24ea%5]) with mapi id 15.20.9275.013; Mon, 3 Nov 2025 08:59:25 +0000 Message-ID: Date: Mon, 3 Nov 2025 14:29:18 +0530 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v2] drm/amdgpu: fix gpu page fault after hibernation on PF passthrough To: Samuel Zhang Cc: victor.zhao@amd.com, haijun.chang@amd.com, Qing.Ma@amd.com, Owen.Zhang2@amd.com, amd-gfx@lists.freedesktop.org References: <20251103040558.1547518-1-guoqing.zhang@amd.com> Content-Language: en-US From: "Lazar, Lijo" In-Reply-To: <20251103040558.1547518-1-guoqing.zhang@amd.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-ClientProxiedBy: PN2PR01CA0049.INDPRD01.PROD.OUTLOOK.COM (2603:1096:c01:22::24) To DS0PR12MB7804.namprd12.prod.outlook.com (2603:10b6:8:142::5) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7804:EE_|CYYPR12MB8655:EE_ X-MS-Office365-Filtering-Correlation-Id: cf0ee567-52d4-4753-2ae6-08de1ab74951 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|366016|1800799024; X-Microsoft-Antispam-Message-Info: =?utf-8?B?b3g0VzNOSk5NYTM5ZDQzQ0FGajJIV0p1ZHZweDkzblRQaGt3V2V0Y2RMc3o3?= =?utf-8?B?SzZPenpRdjExQSt3Q3lEeGsxbmlzRTZsTzhnWVNHdzMvNHJmYVZ2dXozUkRD?= =?utf-8?B?YnI1OXZKUEFxaXphdHBqZXRVMDljZnY4N2pQMVRxUjFuM3B3V0ZtMnpzTjha?= =?utf-8?B?Snp3SFlKVWxnR1lmSWlabHQ2Sm9sM2NoR2hHdUt2VkxyZlMwaC9aRzR1RVN1?= =?utf-8?B?WVNnZkRRdlkwTXRaeDdITnh1QmVYVE1ZeFo5WHQwNmdQM0NmbDBFWDB3RG53?= =?utf-8?B?YTFTYklYTjNrclFhSExKVFFXbWNkcUJrQmwrck8wNXdHRnV2UGhmMk5aR3hV?= =?utf-8?B?dURMODVkMS83SW1GTWVLSmdCcHdEK3R1K0VmekFoTWVFcHgralBaK1pSNHVR?= =?utf-8?B?ZzZ5VTBQQW5WdVJXbU0zaEFvam15WEt3dFpYUG01QmNZdWtKYjgzQTNnN0xG?= =?utf-8?B?aDdsbHMvNllTNCtxMHJtTWdoc1F6THpJVzFyMTRENm55NzFXUEhndW9IYTdt?= =?utf-8?B?MFZPUkwzMXpkUHltUVJSRWxIOSt0cnVuSE9DMUVMMVBIZ1kzZUQya3lqT283?= =?utf-8?B?RjRpeW5zTjRoNisvVElPTmdDYVB1ellxMVpSdTdpUWcvT01qcU5DVmd1blJt?= =?utf-8?B?dy9UZmhQTGhNKzYwUDk2ZjlsQ2hrV1c0ZHBlUGlXbFVPRFBxQ3ljM3RGTWZQ?= =?utf-8?B?eHBaMkxRaUxnTWlSZ3dJYUZ0dnFQdjVxbGNCUi9jUG5WTk9MRDhzY2pVUG1h?= =?utf-8?B?dGNsTlcyRkVjWXcrSmRzODNaeWV0ckFwR0NMYStnZGwvOUFiYUlBVWpFdEYy?= =?utf-8?B?L2NySytKbmFiUlRTbTVrbjk4MkxTeDRnWmlPakJGdTlKYUFvUm5hcys5UHlz?= =?utf-8?B?SlNGUjhOUm1VT0NCcWlLNTdCSWZRNitPd0dPeWM3VjVFMDNJWFExZTJ3cTFV?= =?utf-8?B?MUVhanhoRmo2a3AwUkhkRTlpNzF6blVjQllBZHBlQVZ0RExLNktQYnpIWGJM?= =?utf-8?B?OUwvT1NFNW5wNmk2UG1Ea2hJODJreHZFanF1THBxb2RyZms1MWkxTWExYTkx?= =?utf-8?B?WnZHQUw0TUw4b0d5RkRUSTBFaXhVYitWWVZhK0FTSnBpYXMxVE1NY1VtL3ZE?= =?utf-8?B?ejd0aElpcEl6V0lqdGpPQUVlR3loSTNybVdOcXJRaGNOVy9pSHVxd1B6MUpP?= =?utf-8?B?Mlk3UTQ4UDY3N3Blakx6SWVNSHJDYy9lVnJoU2JWaGkyWisvaThFSUlSQWhK?= =?utf-8?B?M29PeFcvZVlwUVMzdVR6dTJzUVVoRTlRbEZmczJHWWNRVVNTRXBkN0JyZHh5?= =?utf-8?B?SXpWbGlBSGpLTVNPQlNBdUI5NnhkTWppUVdJalRXVjJpV3J3RFFWWnFBbTQw?= =?utf-8?B?dWc2YnFqcXJKMlFNdW1UMFZGWVlGd0huaVdvNWFycU1lbTN0djB4ckFROThz?= =?utf-8?B?dHlnWTkyNTRTUDRNOWhTOHRYdFA2QTdjcUVDQkhLZzZHcm5VUHpOWjNUQzh2?= =?utf-8?B?WEhKZDY1cHhJQ29YMitTWUtkT3FhL2xMUWdNejBZSE9ZWUR4ZC9jNzJIOU5s?= =?utf-8?B?MUxpYjJvWkJzdEJlbHA0TWwxbnpZR0s1VVdyRW5nRjAya21lU25YejRSeE1L?= =?utf-8?B?MWsycVA2Rm4vZmJpUHlKeVFLZVlRUVVqbFZFYmcxR05Razh6SDZxdkEvWXRI?= =?utf-8?B?UExWT1h3aVluelBqWENxUjM2c21Od09DZEJ1UmFDYktHWlBOZmV5YWwwTk5j?= =?utf-8?B?aW1HTzFuMHVSZFJuYVdrN1p6cFRESUtyUEVQRmhTRXhDejFEQjhCWEhicW5X?= =?utf-8?B?Ry9zRG10allvN2xuV25zaXhEY3RGYjZDbjZVSmhhTkwrOFRCRUZyK2U0aTly?= =?utf-8?B?QVJyYktoQkJhWk9CN29iZWFwQ1poNzRGVmJBeXFoSFFsZW8yZzlZdGhrRW9Z?= =?utf-8?Q?J2YBltd+XIPr4WfbSDnFbAggn2UooV4H?= X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DS0PR12MB7804.namprd12.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(366016)(1800799024); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?Z1hEV0U3a3FUR2wxUjJqUW05eFpTOWhuT3F5WUp4UEowamFoVzk5YlVxa3B4?= =?utf-8?B?V3VVb25pa05TR2hQZjN0Rk5DbWhnakV0bVN5R0UxSTc3bWpUdWp2bllGOHhB?= =?utf-8?B?R0dlakEwSk4wYTV0elNyTHpydlIrSnVPUVFYSVJCc3J4Vnl5UEdUNnhJUElU?= =?utf-8?B?K1lIRGVibG5CMTVXd0xJOTJ5RUJ6cXVTMWxzb2sxUkNLUDBpelFkQnNYWEZu?= =?utf-8?B?b1RVL05aY3UxMjczbXR2SmR0TUJmOGJyUmlFWUpubmQ4NDlmYUN4MHN1d1M0?= =?utf-8?B?SEpzL2Z3d0Y1c0g5REphVTlhZjNxelhFNTBsZlNXb3lpTHkwNnVpcVVrM3Z4?= =?utf-8?B?THVyM2pPdnVNU3drZjMzeHc0Z0hZV216VEhYRGRURUNzcjdwbTZCNjF1YU9l?= =?utf-8?B?MVQ1a2JaVnZqNklRV3ZPZkQyRTNiK29odHdTRUJDM2hVb28rWWV5WWtyVVNO?= =?utf-8?B?dmxOWWgxZWRMMWFHdHozQTBJZUhVcS8zMGhIY3B1TEs0eEQ1NzdPLzFWNVNX?= =?utf-8?B?dUtiQ1k5Ui9xdXVwaWZoTCtwQm1TSXpWZDIxQlIyYzJjSWI4L1hLMGI0ZnVt?= =?utf-8?B?Ny84SGxnVVk0QnY2SksxUVlHQmRvMjNPOFdXbHFOWVMzL0RZZWZ1SUN4MzRp?= =?utf-8?B?YmMzOU5COEJKTUM0Z2pLdU05ZjZpRW5sSGxzTEllWHFTdi9NNlVONFFCZExu?= =?utf-8?B?YTcxblFDTE81ckVzSmwvYnBKWjM5NjdNQ0ROR1VCZ1hrdWpJdU1OTDE5dUk3?= =?utf-8?B?UlpGc3NTUW93WDB0anhOblg0VTZteGF0TFJTcm84dDlLUUhERHp4NytJcWx3?= =?utf-8?B?Ui9JUTZKR04rYStxNFJtTEJ4YjJWRTNZbFExdE9lSTF5Yms1RU1oLzMyYi9J?= =?utf-8?B?akozRmFoZmsvdkk2Tkg5TVREbDlqYy82eGVsSG5PVlJsdVpYNTZGR2w0ZlAy?= =?utf-8?B?a3lPV1FwalF3bWhybHBVYXZoODNxZFhWeGVYelFTOVdsczFZR21FOGdrd2l5?= =?utf-8?B?TmJZYVphWDhkTWxubHU2QnUralNlbVNRSGFRaHVRMDUvUDRQenJzVnY4dnFw?= =?utf-8?B?VU1oWmxZTU1IdUs0YzBibEMreDJwSHJoMkhlQWUvUHVVc3JpTVVBdy9UZ25N?= =?utf-8?B?Um14WW03WTV4RDFWZFMzYVJiWDYzYXdFT2NZVHhMNmFqRkN3RU9SazdLTmVM?= =?utf-8?B?TTQrTUoxaEVLakxCYUQ2RDlRSlc1THVJOTN2bU52Z1J5TTdKNEpYalNLT0hF?= =?utf-8?B?aFhVNU1TYU5ERzJ4VUIrY3FteUx4RThycU85Q0FvNmlZaEMvTnJzM1dLcEJ2?= =?utf-8?B?WnFZb1pwK3N6QllDZXdQM2c1RnFNYTIyZFU4TnJhSHlCbEpyZE9aSnpwclpG?= =?utf-8?B?bjczT1R5UkI3eDJVTDZsbTdObDJMd2dpWm9YNU90czcwTXlEbWVYaU80Qy9M?= =?utf-8?B?L1RyL2pvOTRUcGJST3dqWjJZZmUyRnNDS2lkZEdDRlZQRnVnUnZ1ZStFUlEx?= =?utf-8?B?YzUvUXZGbGJxcWpxUWtNdVkvaW44U0RDM1pMd1AwUTZwSmpHc01ibWFJYTBL?= =?utf-8?B?azJUQ3lMaWdwUk9Xem1HSnlWaUF1dm9meGJNcGh3N3RRVGt3eFRCZGJSRFNk?= =?utf-8?B?dFYyeFRUV0w3amhVb3VEcThjekJraXcxK2ZCNEVHeWtUVENGUkg2Z080SEVK?= =?utf-8?B?VXdrdFNoc3UvWEVxWG9HOXJQcDVZUW5uL3ozMi93SVJhRHc1UExraE8zQktH?= =?utf-8?B?amp0ZHRPeExLTm81K2NyaitvRVQ0c3VxTkZvc3RGYVpqTkJaZ1dsL1NoMDB3?= =?utf-8?B?OFF3TFdqdi9iOGxHcmlCSkY1YjBtNmpCWERXc214YXR0WCtvVTk1VytwVFdr?= =?utf-8?B?ajBnSXZRcFZTc01nM1pMVkU2Y3cyQVBwdGdGVkFPNWZQVUdEMVdFNTRJZ1lT?= =?utf-8?B?QTkxbi9DZDdpeFJnL2p0dFBFL3NUUUR0S09TZGpZQVZIQzdZUmlmd1pMQ1ps?= =?utf-8?B?Y25ZM1VBOVQvNlZyV2tjS0I5a2VCRkErK2pFN0dNZFlPTUNXdFI2WU9Ub3c5?= =?utf-8?B?NkNiMlJEWVZmMm4rdlc3TUdHNVYydi9DSk5IdmJHamVLZWhZSzlnQTI4bytm?= =?utf-8?Q?uEWnA6Qo8x/LC015f5udBjwN1?= X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-Network-Message-Id: cf0ee567-52d4-4753-2ae6-08de1ab74951 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7804.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 03 Nov 2025 08:59:24.9124 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: GR7u9J2kt7ssXj+LNC5ayVRCRvu8ZP0R/gP4ZZmsmeJm+i6iqzPQ4Cq9nhModxPa X-MS-Exchange-Transport-CrossTenantHeadersStamped: CYYPR12MB8655 X-BeenThere: amd-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Discussion list for AMD gfx List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: amd-gfx-bounces@lists.freedesktop.org Sender: "amd-gfx" On 11/3/2025 9:35 AM, Samuel Zhang wrote: > On PF passthrough environment, after hibernate and then resume, coralgemm > will cause gpu page fault. > > Mode1 reset happens during hibernate, but partition mode is not restored > on resume, register mmCP_HYP_XCP_CTL and mmCP_PSP_XCP_CTL is not right > after resume. When CP access the MQD BO, wrong stride size is used, > this will cause out of bound access on the MQD BO, resulting page fault. > > The fix is to ensure gfx_v9_4_3_switch_compute_partition() is called > when resume from a hibernation. > > Signed-off-by: Samuel Zhang > --- > drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 3 ++- > drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 4 +++- > 2 files changed, 5 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c > index 811124ff88a8..75357e8a35b2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c > +++ b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c > @@ -407,7 +407,8 @@ static int aqua_vanjaram_switch_partition_mode(struct amdgpu_xcp_mgr *xcp_mgr, > return -EINVAL; > } > > - if (adev->kfd.init_complete && !amdgpu_in_reset(adev)) > + if (adev->kfd.init_complete && !amdgpu_in_reset(adev) && > + !adev->in_s4) The logic should be the same for a generic suspend scenario. Please make it in_suspend (here and in the cp_resume call). Thanks, Lijo > flags |= AMDGPU_XCP_OPS_KFD; > > if (flags & AMDGPU_XCP_OPS_KFD) { > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c > index c4c551ef6b87..a12c72213a79 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c > @@ -2291,7 +2291,9 @@ static int gfx_v9_4_3_cp_resume(struct amdgpu_device *adev) > r = amdgpu_xcp_init(adev->xcp_mgr, num_xcp, mode); > > } else { > - if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr, > + if (adev->in_s4) /* Restore if resuming from suspend */ > + amdgpu_xcp_restore_partition_mode(adev->xcp_mgr); > + else if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr, > AMDGPU_XCP_FL_NONE) == > AMDGPU_UNKNOWN_COMPUTE_PARTITION_MODE) > r = amdgpu_xcp_switch_partition_mode(