* RFC: /proc/cpuinfo confusion with AMD processors
@ 2014-06-30 12:50 Prarit Bhargava
2014-06-30 13:13 ` Borislav Petkov
0 siblings, 1 reply; 7+ messages in thread
From: Prarit Bhargava @ 2014-06-30 12:50 UTC
To: Linux Kernel; +Cc: the arch/x86 maintainers, Andrew Morton
AMD defines a "Package" as the hardware processor itself. Each Package contains
multiple Nodes, and each Node has multiple Compute Units. Each Compute Unit can
have up to 2 cores that [with the 62xx and 63xx] do not have multiple Threads.
That is, to determine the number of CPUs that Linux sees, multiply
Package * Nodes * Compute Units * Cores
Note that Nodes and Compute Units are not indicated in /proc/cpuinfo directly
(although it could be argued that they should be).
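As a worked instance of the multiplication above, the sketch below plugs in counts for the Opteron 6386 SE box whose cpuinfo output is quoted further down in this message. The per-level counts are assumptions inferred from that output (16 siblings per package and 8 cores per node suggest 2 nodes per package; 2 cores per compute unit gives 4 compute units per node), the package count of 2 is assumed from "physical id : 1" on the last processor, and the function name is purely illustrative.

```python
def logical_cpus(packages, nodes_per_package, compute_units_per_node,
                 cores_per_compute_unit):
    """Multiply the topology levels to get the number of CPUs Linux sees."""
    return (packages * nodes_per_package
            * compute_units_per_node * cores_per_compute_unit)


# Assumed layout of a two-socket Opteron 6386 SE system (not stated outright
# in the thread; inferred from the siblings / cpu cores values).
total = logical_cpus(packages=2, nodes_per_package=2,
                     compute_units_per_node=4, cores_per_compute_unit=2)
print(total)  # 32, consistent with "processor : 31" being the last CPU
```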
The output of /proc/cpuinfo is confusing at this point as ...
processor : 31
vendor_id : AuthenticAMD
cpu family : 21
model : 2
model name : AMD Opteron(tm) Processor 6386 SE
stepping : 0
microcode : 0x6000822
cpu MHz : 2800.000
cache size : 2048 KB
physical id : 1
siblings : 16 <<< this is number of threads per package
core id : 7 <<< this is the core id of this thread relative to node
cpu cores : 8 <<< this is the number of cores per node
which makes deciphering the system topology quite difficult as values are
relative to both nodes and the entire package. It is not possible using this
information to uniquely identify a processor.
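The ambiguity described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are made up, and the second record in SAMPLE (processor 23) is a hypothetical companion to the real processor 31 record, standing in for a core with the same core id on the other node of the same package. On a part with several threads per core, thread siblings would legitimately share a (physical id, core id) pair, so a collision is only a sign of trouble on parts like the ones discussed here.

```python
from collections import defaultdict


def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per logical CPU."""
    cpus, current = [], {}
    for line in text.splitlines():
        if not line.strip():          # a blank line ends one processor record
            if current:
                cpus.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def ambiguous_cores(cpus):
    """Return (physical id, core id) pairs shared by more than one processor."""
    groups = defaultdict(list)
    for cpu in cpus:
        groups[(cpu["physical id"], cpu["core id"])].append(int(cpu["processor"]))
    return {key: procs for key, procs in groups.items() if len(procs) > 1}


# Abbreviated sample. Processor 23 is HYPOTHETICAL: it is assumed to sit on
# the other node of package 1 and therefore to reuse core id 7.
SAMPLE = """\
processor : 23
physical id : 1
core id : 7

processor : 31
physical id : 1
core id : 7
"""

print(ambiguous_cores(parse_cpuinfo(SAMPLE)))
```

On a live system the same functions could be fed the contents of /proc/cpuinfo instead of SAMPLE.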
I want to make two changes. The first is to modify /proc/cpuinfo to include
the node information for all x86 processors and a "compute unit id" field
for AMD systems.
The second change would be to expose the same data in /sys, whose topology
directory currently does not contain it.
Thoughts/concerns?
P.
* Re: RFC: /proc/cpuinfo confusion with AMD processors
From: Borislav Petkov @ 2014-06-30 13:13 UTC
To: Prarit Bhargava; +Cc: Linux Kernel, the arch/x86 maintainers, Andrew Morton

On Mon, Jun 30, 2014 at 08:50:47AM -0400, Prarit Bhargava wrote:
> AMD defines a "Package" as the hardware processor itself. Each Package contains
> multiple Nodes, and each Node has multiple Compute Units. Each Compute Unit can
> have up to 2 cores that [with the 62xx and 63xx] do not have multiple Threads.
>
> That is, to determine the number of CPUs that Linux sees, multiply
>
> Package * Nodes * Compute Units * Cores
>
> Note that Nodes and Compute Units are not indicated in /proc/cpuinfo directly
> (although it could be argued that they should be).
>
> The output of /proc/cpuinfo is confusing at this point as ...
>
> processor : 31
> vendor_id : AuthenticAMD
> cpu family : 21
> model : 2
> model name : AMD Opteron(tm) Processor 6386 SE
> stepping : 0
> microcode : 0x6000822
> cpu MHz : 2800.000
> cache size : 2048 KB
> physical id : 1
> siblings : 16 <<< this is number of threads per package
> core id : 7 <<< this is the core id of this thread relative to node
> cpu cores : 8 <<< this is the number of cores per node

siblings / cpu cores = threads per compute unit.

> which makes deciphering the system topology quite difficult as values are
> relative to both nodes and the entire package. It is not possible using this
> information to uniquely identify a processor.

To do what with that information? What is the task you're trying to
accomplish?

> Thoughts/concerns?
BIOS does all kinds of hacks and renumbering to accommodate the
brainf*cked design of other OSes, so this info you're trying to put in
cpuinfo might turn out to be completely misleading and utterly useless
in some cases.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
* Re: RFC: /proc/cpuinfo confusion with AMD processors
From: Prarit Bhargava @ 2014-06-30 13:29 UTC
To: Borislav Petkov; +Cc: Linux Kernel, the arch/x86 maintainers, Andrew Morton

On 06/30/2014 09:13 AM, Borislav Petkov wrote:
> On Mon, Jun 30, 2014 at 08:50:47AM -0400, Prarit Bhargava wrote:
>> AMD defines a "Package" as the hardware processor itself. Each Package contains
>> multiple Nodes, and each Node has multiple Compute Units. Each Compute Unit can
>> have up to 2 cores that [with the 62xx and 63xx] do not have multiple Threads.
>>
>> That is, to determine the number of CPUs that Linux sees, multiply
>>
>> Package * Nodes * Compute Units * Cores
>>
>> Note that Nodes and Compute Units are not indicated in /proc/cpuinfo directly
>> (although it could be argued that they should be).
>>
>> The output of /proc/cpuinfo is confusing at this point as ...
>>
>> processor : 31
>> vendor_id : AuthenticAMD
>> cpu family : 21
>> model : 2
>> model name : AMD Opteron(tm) Processor 6386 SE
>> stepping : 0
>> microcode : 0x6000822
>> cpu MHz : 2800.000
>> cache size : 2048 KB
>> physical id : 1
>> siblings : 16 <<< this is number of threads per package
>> core id : 7 <<< this is the core id of this thread relative to node
>> cpu cores : 8 <<< this is the number of cores per node
>
> siblings / cpu cores = threads per compute unit.

Yes, I get that. But this doesn't uniquely identify *which* processor it is.

>> which makes deciphering the system topology quite difficult as values are
>> relative to both nodes and the entire package. It is not possible using this
>> information to uniquely identify a processor.
>
> To do what with that information? What is the task you're trying to
> accomplish?

Admins load systems relative to nodes, and look at /proc/cpuinfo as a "known"
place to get that info.
At the end of the day I'd like to be able to human-read /proc/cpuinfo and
get the correct data out of it without jumping through hoops to get processor
location information. I can, in theory, use the initial apicid and the apicid
to map everything out ... but I shouldn't have to. /proc/cpuinfo provides an
easy lookup for this data for Intel; why not for AMD?

Additionally the turbostat utility, for example, is broken because of this lack
of info. It assumes that /sys/../cpu/cpuX/topology/core_id is *per package*
when on AMD systems it is reported as per node.

>> Thoughts/concerns?
>
> BIOS does all kinds of hacks and renumbering to accommodate the
> brainf*cked design of other OSes so this info you're trying to put in
> cpuinfo might turn to be completely misleading utterly useless in some
> cases.

I read that exact same thing somewhere else. The argument you're making is that
it might be broken for some other reason. It already is completely misleading
and utterly useless in the above case. A simple test patch to
arch/x86/kernel/cpu/proc.c shows that including the information above is
trivial. Adding it to /sys/.../cpu/cpuX/topology is a little bit more difficult.

P.
* Re: RFC: /proc/cpuinfo confusion with AMD processors
From: Borislav Petkov @ 2014-06-30 13:38 UTC
To: Prarit Bhargava; +Cc: Linux Kernel, the arch/x86 maintainers, Andrew Morton

On Mon, Jun 30, 2014 at 09:29:05AM -0400, Prarit Bhargava wrote:
> Yes, I get that. But this doesn't uniquely identify *which* processor
> it is.

What do you mean, which processor it is? You want to know which
processor on the motherboard, physically? When you look at the mobo and
if all is enumerated regularly and you have a, say, two socket system
with the processors above, to be able to say that processor 31 is the
16th core on the second node on the motherboard? Something like that?

> Admins load systems relative to nodes, and look at /proc/cpuinfo as a
> "known" place to get that info. At the end of the day I'd like to be
> able to human-read /proc/cpuinfo and get the correct data out of it
> without jumping through hoops

What correct data? All that's there is correct AFAICT. Give an example
of what you want to do?

> to get processor location information. I can, in theory, use the
> initial apicid and the apicid to map everything out ... but I
> shouldn't have to. /proc/cpuinfo provides an easy look up for this
> data for Intel; why not for AMD?

What does it show on Intel that's clear there and that's puzzling you on
AMD?

> Additionally the turbostat utility, for example, is broken because of
> this lack of info. It assumes that /sys/../cpu/cpuX/topology/core_id
> is *per package* when on AMD systems it is reported as per node.

The turbostat utility might need some testing and fixing on AMD. If you
can get some resources to do that I think everyone will profit from it.

> I read that exact same thing somewhere else.
> The argument you're making is that it might be broken for some other
> reason. It already is completely misleading and utterly useless in the
> above case. A simple test patch to arch/x86/kernel/cpu/proc.c shows that
> including the information above is trivial. Adding it to
> /sys/.../cpu/cpuX/topology is a little bit more difficult.

And I'm still saying that that information - albeit being trivial to add
- might be lying on some system doing renumbering.

So, again: what exactly do you want to be able to figure out from
/proc/cpuinfo and what is puzzling you? Give concrete examples please...

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
* Re: RFC: /proc/cpuinfo confusion with AMD processors
From: Prarit Bhargava @ 2014-06-30 14:07 UTC
To: Borislav Petkov; +Cc: Linux Kernel, the arch/x86 maintainers, Andrew Morton

On 06/30/2014 09:38 AM, Borislav Petkov wrote:
> On Mon, Jun 30, 2014 at 09:29:05AM -0400, Prarit Bhargava wrote:
>> Yes, I get that. But this doesn't uniquely identify *which* processor
>> it is.
>
> What do you mean, which processor it is? You want to know which
> processor on the motherboard, physically? When you look at the mobo and
> if all is enumerated regularly and you have a, say, two socket system
> with the processors above, to be able to say that processor 31 is the
> 16th core on the second node on the motherboard? Something like that?

Sorry, yes, exactly that. Requests have come in where an admin is setting up
system loads relative to specific nodes and cores. Determining that information
is trivial on Intel and a lot more difficult on AMD. (see below)

>> to get processor location information. I can, in theory, use the
>> initial apicid and the apicid to map everything out ... but I
>> shouldn't have to. /proc/cpuinfo provides an easy look up for this
>> data for Intel; why not for AMD?
>
> What does it show on Intel that's clear there and that's puzzling you on
> AMD?

Intel does not have the concept of multi-node packages (AFAIK). So the existing
Intel physical id and core id (both of which are relative to the package) are
good enough to determine which core it is. For example, from an Intel box

processor : 3
...
physical id : 3
siblings : 2
core id : 1
cpu cores : 2

tells me that this processor 3 is on socket/package (physical id) 3, core 1 out
of 2, and the cores are not hyperthreaded.
On a system in which an admin would like to (for example) set cpu affinity for
a VM or a particular application, knowing this information is useful.

On an AMD system,

processor : 31
physical id : 1
siblings : 16
core id : 7
cpu cores : 8

implies (using the logic above), package/socket 1, core 7/8 cores, and
multi-threading is on ... which is incorrect. The package in question is 16
thread and 16 core with no multi-threading.

The difference in the result occurs because AMD is (IMO) erroneously reporting
per node information. Another view of this could be that on AMD systems we
should modify the output to report per package information to make it
consistent with Intel (and other arches ... ppc reports per socket as does
s390. I haven't checked ARM yet). AMD is the outlier here.

>> Additionally the turbostat utility, for example, is broken because of
>> this lack of info. It assumes that /sys/../cpu/cpuX/topology/core_id
>> is *per package* when on AMD systems it is reported as per node.
>
> The turbostat utility might need some testing and fixing on AMD. If you
> can get some resources to do that I think everyone will profit from it.

The problem isn't turbostat though -- it is that core_id implies two different
things for two different x86 processors. Intel reports that value as per
package, while AMD reports it as per node (and there is no way to determine
which node it is). The problem is that the values are ambiguous relative to
processor vendor and they shouldn't be.

P.
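The "logic above" can be written down as a short sketch to show where the two readings diverge. The function name and the returned keys are illustrative assumptions, not anything the kernel or turbostat defines; the field values are the Intel and AMD samples from this message. Whether the siblings / cpu cores ratio means threads per core or threads per compute unit is exactly the point under dispute in this thread.

```python
def intel_style_reading(fields):
    """Interpret cpuinfo fields as if every value were relative to the package."""
    siblings = int(fields["siblings"])
    cores = int(fields["cpu cores"])
    return {
        "package": int(fields["physical id"]),
        "core": int(fields["core id"]),
        "cores_in_package": cores,
        # On Intel this ratio is the number of hardware threads per core.
        "threads_per_core": siblings // cores,
    }


intel = {"physical id": "3", "siblings": "2", "core id": "1", "cpu cores": "2"}
amd = {"physical id": "1", "siblings": "16", "core id": "7", "cpu cores": "8"}

print(intel_style_reading(intel)["threads_per_core"])  # 1: no hyperthreading
print(intel_style_reading(amd)["threads_per_core"])    # 2: looks like SMT
```

Applied to the AMD sample, the same reading reports 8 cores in the package and two threads per core, whereas the package actually holds 16 cores across two nodes.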
* Re: RFC: /proc/cpuinfo confusion with AMD processors
From: Borislav Petkov @ 2014-06-30 18:27 UTC
To: Prarit Bhargava; +Cc: Linux Kernel, the arch/x86 maintainers, Andrew Morton

On Mon, Jun 30, 2014 at 10:07:57AM -0400, Prarit Bhargava wrote:
> Sorry, yes, exactly that. Requests have come in where an admin
> is setting up system loads relative to specific nodes and cores.
> Determining that information is trivial on Intel and a lot more
> difficult on AMD. (see below)

Yeah, that might not always work, see below.

> Intel does not have the concept of multi-node packages (AFAIK).

Let's establish terminology first:

So I think you mean MCM - Multi-Chip Module where you have more than one
node, right?

https://en.wikipedia.org/wiki/Multi-chip_module

And according to the wiki article, Intel has that too. HOWEVER(!), it is
a whole other question whether that is made visible to the software!

> So the existing Intel
> physical id and core id (both of which are relative to the package) are
> good enough to determine which core it is. For example, from an Intel box
>
> processor : 3
> ...
> physical id : 3
> siblings : 2
> core id : 1
> cpu cores : 2
>
> tells me that this processor 3 is on socket/package (physical id) 3, core 1 out
> of 2, and the cores are not hyperthreaded. On a system in which an admin would
> like to (for example) set cpu affinity for a VM or a particular application
> knowing this information is useful.

> On an AMD system,
>
> processor : 31
> physical id : 1
> siblings : 16
> core id : 7
> cpu cores : 8
>
> implies (using the logic above), package/socket 1,

Yes, the second socket.

> core 7/8 cores, and multi-threading is on ... which is incorrect.

What says that multithreading is on? How do you decide that? I think
you're reading too much into it.

> The package in question is 16 thread and 16 core with no multi-threading.

No, this is wrong. So you are on socket 1 ("physical id"), it has 16
siblings, i.e. threads. In this case, 16 threads means 16/2 = 8 compute
units. And core id 7 means the id of the core on this NUMA node.

But just to show you that this information can be misleading, let's look
at another AMD box:

processor : 30
cpu family : 21
model : 1
model name : AMD Opteron(TM) Processor 6272
stepping : 2
microcode : 0x600063d
cpu MHz : 1400.000
cache size : 2048 KB
physical id : 0
siblings : 16
core id : 7
cpu cores : 8
apicid : 47
initial apicid : 15

processor : 31
cpu family : 21
model : 1
model name : AMD Opteron(TM) Processor 6272
stepping : 2
microcode : 0x600063d
cpu MHz : 1400.000
cache size : 2048 KB
physical id : 1
siblings : 16
core id : 7
cpu cores : 8
apicid : 79
initial apicid : 47

So those are the last two cores, 30 and 31. Now look at physical id - 30
is on node 0 and 31 is on node 1. So thread 30 is on physical package 0
and thread 31 is on physical package 1, i.e. on different physical
sockets. And yet, all the info there is correct - it is just not
complete for your purposes.

If you want to have the detailed information you're interested in,
simply do:

$ grep . /sys/devices/system/cpu/cpu30/topology/*
/sys/devices/system/cpu/cpu30/topology/core_id:7
/sys/devices/system/cpu/cpu30/topology/core_siblings:55555555
/sys/devices/system/cpu/cpu30/topology/core_siblings_list:0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30

All the core siblings, i.e. threads on this node, i.e. package 0.

/sys/devices/system/cpu/cpu30/topology/physical_package_id:0

This is the physical CPU package 0 on the socket in the motherboard.

/sys/devices/system/cpu/cpu30/topology/thread_siblings:50000000
/sys/devices/system/cpu/cpu30/topology/thread_siblings_list:28,30

This is thread 30 in a compute unit together with thread 28.

$ grep . /sys/devices/system/cpu/cpu31/topology/*
/sys/devices/system/cpu/cpu31/topology/core_id:7
/sys/devices/system/cpu/cpu31/topology/core_siblings:aaaaaaaa
/sys/devices/system/cpu/cpu31/topology/core_siblings_list:1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
/sys/devices/system/cpu/cpu31/topology/physical_package_id:1

This is the physical CPU package 1 on the socket in the motherboard.

/sys/devices/system/cpu/cpu31/topology/thread_siblings:a0000000
/sys/devices/system/cpu/cpu31/topology/thread_siblings_list:29,31

This is thread 31 in a compute unit together with thread 29.

At the time when we were talking about adding that info to
/proc/cpuinfo, the proposal was declined by reviewers and it was said
that we have the topology hierarchy in sysfs where it all should be. And
I agree now, btw...

> The difference in the result occurs because AMD is (IMO) erroneously
> reporting per node information. Another view of this could be that
> on AMD systems we should modify the output to report per package
> information to make it consistent with Intel (and other arches ... ppc
> reports per socket as does s390. I haven't checked ARM yet). AMD is
> the outlier here.

... you can't express complicated topologies in /proc/cpuinfo. This is
nicely done in sysfs where all tools can access it.

> The problem isn't turbostat though -- it is that core_id implies two
> different things for two different x86 processors. Intel reports that
> value as per package, while AMD reports it as per node (and there is
> no way to determine which node it is).

No, you're making the wrong assumption that a single package can have
only one NUMA node. Which is basically not true on AMD.

core_id is the id of the core on that NUMA node. It means the same on
Intel *and* AMD. If you have more than one node in a package, the
core_id is still correct as it is the numbering within the NUMA node.

Again, /proc/cpuinfo is not the proper place for complicated topologies,
at least on x86.
For NUMA we have different tools like numactl which give you that
precise information of the system topology and there was this other tool
which even generates PDFs of the topology. I'm forgetting the name right
now though.

HTH.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
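The sysfs values quoted in the message above come in two encodings: hex masks (core_siblings, thread_siblings) and CPU lists (the *_list files). A minimal sketch of decoding both is given below, so the two can be cross-checked against each other. The function names are illustrative, not an existing API, and the inputs are the strings copied from the grep output above.

```python
def cpumask_to_cpus(mask_hex):
    """Decode a sysfs hex CPU mask into a sorted list of CPU numbers."""
    # Masks wider than 32 bits are printed as comma-separated 32-bit words,
    # so the commas are dropped before parsing.
    value = int(mask_hex.strip().replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def cpulist_to_cpus(cpulist):
    """Expand a sysfs CPU list such as "0-3,8,10" into CPU numbers."""
    cpus = []
    for part in cpulist.strip().split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


# Values taken from the cpu30 and cpu31 topology output quoted above.
print(cpumask_to_cpus("50000000"))   # thread_siblings of cpu30: [28, 30]
print(cpumask_to_cpus("a0000000"))   # thread_siblings of cpu31: [29, 31]
print(cpumask_to_cpus("55555555") == cpulist_to_cpus(
    "0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30"))  # True
```

On a running system the same strings could be read from the files under /sys/devices/system/cpu/cpuN/topology/ rather than hard-coded.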
* Re: RFC: /proc/cpuinfo confusion with AMD processors
From: Pavel Machek @ 2014-07-02 22:01 UTC
To: Borislav Petkov
Cc: Prarit Bhargava, Linux Kernel, the arch/x86 maintainers, Andrew Morton

On Mon 2014-06-30 15:38:03, Borislav Petkov wrote:
> On Mon, Jun 30, 2014 at 09:29:05AM -0400, Prarit Bhargava wrote:
> > Yes, I get that. But this doesn't uniquely identify *which* processor
> > it is.
>
> What do you mean, which processor it is? You want to know which
> processor on the motherboard, physically? When you look at the mobo and
> if all is enumerated regularly and you have a, say, two socket system
> with the processors above, to be able to say that processor 31 is the
> 16th core on the second node on the motherboard? Something like that?

I'd say so. Some machines support physical cpu hotplug, and it would
be useful to know which cpu you are plugging out...

(For example if you want to pull out / replace the cpu that has
problems... Or maybe more realistically you want to offline a CPU so
that you can replace a noisy CPU fan.)

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html