Linux PCI subsystem development
From: Yazen Ghannam <yazen.ghannam@amd.com>
To: Michal Pecio <michal.pecio@gmail.com>
Cc: Shyam-sundar.S-k@amd.com, bhelgaas@google.com,
	hdegoede@redhat.com, ilpo.jarvinen@linux.intel.com,
	jdelvare@suse.com, linux-edac@vger.kernel.org,
	linux-hwmon@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, linux@roeck-us.net,
	mario.limonciello@amd.com, naveenkrishna.chatradhi@amd.com,
	platform-driver-x86@vger.kernel.org, suma.hegde@amd.com,
	tony.luck@intel.com, x86@kernel.org
Subject: Re: [PATCH v3 06/12] x86/amd_nb: Use topology info to get AMD node count
Date: Wed, 22 Oct 2025 09:39:01 -0400
Message-ID: <20251022133901.GB7243@yaz-khff2.amd.com>
In-Reply-To: <20251022011610.60d0ba6e.michal.pecio@gmail.com>

On Wed, Oct 22, 2025 at 01:16:10AM +0200, Michal Pecio wrote:
> > Currently, the total AMD node count is determined by searching and
> > counting CPU/node devices using PCI IDs.
> > 
> > However, AMD node information is already available through topology
> > CPUID/MSRs. The recent topology rework has made this info easier to
> > access.
> > 
> > Replace the node counting code with a simple product of topology info.
> > 
> > Every node/northbridge is expected to have a 'misc' device. Clear
> > everything out if a 'misc' device isn't found on a node.
> 
> Hi,
> 
> I have a weird/buggy AM3 machine (Asus M4A88TD-M EVO, Phenom 965) where
> the kernel believes there are two packages and this assumption fails.
> 
> [    0.072051] CPU topo: Max. logical packages:   2
> [    0.072052] CPU topo: Max. logical dies:       2
> [    0.072052] CPU topo: Max. dies per package:   1
> [    0.072057] CPU topo: Max. threads per core:   1
> [    0.072058] CPU topo: Num. cores per package:     4
> [    0.072059] CPU topo: Num. threads per package:   4
> 
> It's currently on v6.12 series and working OK, but I remember trying
> v6.15 at one point and finding that EDAC and GART IOMMU were broken
> because the NB driver failed to initialize. This fixed it:
> 
> --- a/arch/x86/kernel/cpu/topology.c
> +++ b/arch/x86/kernel/cpu/topology.c
> @@ -496,8 +496,8 @@ void __init topology_init_possible_cpus(void)
>         total_cpus = allowed;
>         set_nr_cpu_ids(allowed);
>  
> -       cnta = domain_weight(TOPO_PKG_DOMAIN);
> -       cntb = domain_weight(TOPO_DIE_DOMAIN);
> +       cnta = 1;
> +       cntb = 1;
>         __max_logical_packages = cnta;
>         __max_dies_per_package = 1U << (get_count_order(cntb) - get_count_order(cnta));
> 
> It was a few weeks ago and the machine is currently back on v6.12,
> but I'm almost sure I tracked it down to this exact code:
> 
> > +	amd_northbridges.num = amd_num_nodes();
> > [...]
> > +		/*
> > +		 * Each Northbridge must have a 'misc' device.
> > +		 * If not, then uninitialize everything.
> > +		 */
> > +		if (!node_to_amd_nb(i)->misc) {
> > +			amd_northbridges.num = 0;
> > +			kfree(nb);
> > +			return -ENODEV;
> > +		}
> 

Hi Michal,

Thanks for reporting this.

Can you please share the full output from dmesg and lspci?

Also, can you please share the raw CPUID output (cpuid -r)?

Thanks,
Yazen
