From: Arjan van de Ven
Date: Tue, 21 Apr 2026 04:54:24 -0700
Subject: Re: [PATCH] drm/amdgpu: fix zero-size GDS range init on RDNA4
To: Christian König, amd-gfx@lists.freedesktop.org
Cc: Alex Deucher, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
References: <20260420215717.223372-1-arjan@linux.intel.com> <34718f21-712a-4161-98e0-079dd9390ae6@amd.com>
In-Reply-To: <34718f21-712a-4161-98e0-079dd9390ae6@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/20/2026 11:42 PM, Christian König wrote:
> On 4/20/26 23:57, arjan@linux.intel.com wrote:
>>
>> RDNA4 (GFX 12) hardware removes the GDS, GWS, and OA on-chip memory
>> resources. The gfx_v12_0 initialisation code correctly leaves
>> adev->gds.gds_size, adev->gds.gws_size, and adev->gds.oa_size at
>> zero to reflect this.
>>
>> amdgpu_ttm_init() unconditionally calls amdgpu_ttm_init_on_chip() for
>> each of these resources regardless of size. When the size is zero,
>> amdgpu_ttm_init_on_chip() forwards the call to ttm_range_man_init(),
>> which calls drm_mm_init(mm, 0, 0). drm_mm_init() immediately fires
>> DRM_MM_BUG_ON(start + size <= start) -- trivially true when size is
>> zero -- crashing the kernel during modprobe of amdgpu on an RX 9070 XT.
>
> Mhm in general not a bad idea, but we are having tons of GFX 12 systems in our test machines and nothing is crashing there.
>
> We are clearly missing something here. Is that on an upstream kernel or something backported?
>

The reported oops etc. say 6.18.22, so that does not sound like something crazy backported
(https://bugzilla.kernel.org/show_bug.cgi?id=221376)
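
For anyone following along, the condition described in the quoted commit message can be modelled in user space. This is only a sketch, not the kernel code and not the patch itself: range_check_fires() and init_on_chip_guarded() are hypothetical names made up for illustration, and the only thing taken from the thread is the expression start + size <= start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * User-space model of the expression quoted in the commit message:
 * DRM_MM_BUG_ON(start + size <= start).
 * Returns true when that condition would hold, i.e. for size == 0
 * or when start + size wraps around.
 */
static bool range_check_fires(uint64_t start, uint64_t size)
{
	return start + size <= start;
}

/*
 * Hypothetical guard (an assumption, not the actual patch): skip the
 * range manager setup for an on-chip resource that has no backing
 * memory. Returns true only if a range manager would be set up.
 */
static bool init_on_chip_guarded(uint64_t size)
{
	if (size == 0)
		return false;	/* nothing to manage on this hardware */

	/* With a non-zero size and start == 0 the check cannot fire. */
	assert(!range_check_fires(0, size));
	return true;
}
```

With start = 0 and size = 0 the model reports that the check fires, which matches the drm_mm_init(mm, 0, 0) case described above. Any non-zero size that does not wrap passes.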