Date: Fri, 11 Apr 2025 16:39:02 +0200
From: Boris Brezillon
To: Christian König
Cc:
 Alyssa Rosenzweig, Steven Price, Liviu Dudau, Adrián Larumbe,
 lima@lists.freedesktop.org, Qiang Yu, David Airlie, Simona Vetter,
 Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
 dri-devel@lists.freedesktop.org, Dmitry Osipenko, kernel@collabora.com,
 Faith Ekstrand
Subject: Re: [PATCH v3 0/8] drm: Introduce sparse GEM shmem
Message-ID: <20250411163902.1d0db9da@collabora.com>
References: <20250404092634.2968115-1-boris.brezillon@collabora.com>
 <20250410164809.21109cbc@collabora.com>
 <20250410175349.6bf6a4ea@collabora.com>
 <20250410192054.24a592a5@collabora.com>
 <20250410204155.55d5cfc7@collabora.com>
 <4d47cb90-8ed4-4a69-bd91-b90ebd2c9aca@amd.com>
 <20250411103837.753cd92e@collabora.com>
 <9fd6fb8c-7dbb-467d-a759-eec852e1f006@amd.com>
 <20250411140254.042f9862@collabora.com>
 <73a28f35-9576-4089-8976-07cd1aa64432@amd.com>
 <20250411150056.62cb7042@collabora.com>
Organization: Collabora

On Fri, 11 Apr 2025 15:13:26 +0200
Christian König wrote:

> >
> >> Background is that you don't get a crash, nor error message, nor
> >> anything indicating what is happening.
> > The job times out at some point, but we might get stuck in the fault
> > handler waiting for memory, which is pretty close to a deadlock, I
> > suspect.
>
> I don't know those drivers that well, but at least for amdgpu the
> problem would be that the timeout handling would need to grab some of
> the locks the memory management is holding waiting for the timeout
> handling to do something....
>
> So that basically perfectly closes the circle. With a bit of luck you
> get a message after some time that the kernel is stuck, but since
> those are all sleeping locks I strongly doubt it.
>
> As an immediate action, please provide patches which change those
> GFP_KERNEL into GFP_NOWAIT.

Sure, I can do that.
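
For readers following the thread, here is a rough userspace sketch of why
the GFP_KERNEL -> GFP_NOWAIT switch breaks the cycle described above. It is
only an illustration under stated assumptions, not the actual patch: every
name in it (toy_pool, toy_gfp, toy_alloc_page) is hypothetical, and the
"would sleep forever" case is modelled as an -EDEADLK return instead of
really blocking. The one kernel fact it relies on is that a GFP_KERNEL
allocation may sleep waiting for reclaim, while GFP_NOWAIT fails
immediately when no memory is readily available.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two allocation modes; not kernel APIs. */
enum toy_gfp { TOY_GFP_KERNEL, TOY_GFP_NOWAIT };

struct toy_pool {
	size_t free_pages;      /* pages available without any reclaim */
	bool reclaim_needs_job; /* reclaim can only progress once the
				 * faulting job has finished */
};

/*
 * Returns 0 on success, -ENOMEM when a non-blocking allocation fails
 * fast, and -EDEADLK where a blocking allocation would wait on reclaim
 * that itself waits on the job that triggered the fault.
 */
static int toy_alloc_page(struct toy_pool *pool, enum toy_gfp gfp)
{
	assert(pool != NULL);

	if (pool->free_pages > 0) {
		pool->free_pages--;
		return 0;
	}

	/* Non-blocking mode: report failure, let the caller fail the job. */
	if (gfp == TOY_GFP_NOWAIT)
		return -ENOMEM;

	/* Blocking mode: we would sleep here until reclaim frees a page. */
	if (pool->reclaim_needs_job)
		return -EDEADLK; /* modelled deadlock, see lead-in */

	/* Reclaim made progress: pretend it freed the page we consume. */
	return 0;
}
```

With an exhausted pool, the non-blocking mode returns -ENOMEM straight
away, so the fault handler can give up and the job can be failed or timed
out normally, whereas the blocking mode is the path that can end up stuck.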