Date: Wed, 13 Jul 2016 07:06:18 -0700
From: "Paul E. McKenney"
To: Ingo Molnar
Cc: Peter Zijlstra, "H. Peter Anvin", tglx@linutronix.de, mingo@elte.hu,
 ak@linux.intel.com, linux-kernel@vger.kernel.org
Subject: Re: Odd performance results
Reply-To: paulmck@linux.vnet.ibm.com
References: <20160710042639.GA4068@linux.vnet.ibm.com>
 <7DF218CD-22F6-4E46-A628-2138AEA3A161@infradead.org>
 <20160710144327.GX4650@linux.vnet.ibm.com>
 <20160712145551.GU30909@twins.programming.kicks-ass.net>
 <20160712150529.GN7094@linux.vnet.ibm.com>
 <27d2c710-479d-77a9-f2c6-875e9c2bc40f@zytor.com>
 <20160712185120.GX30909@twins.programming.kicks-ass.net>
 <20160713071817.GC13006@gmail.com>
In-Reply-To: <20160713071817.GC13006@gmail.com>
Message-Id: <20160713140618.GE7094@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 13, 2016 at 09:18:17AM +0200, Ingo Molnar wrote:
>
> * Peter Zijlstra wrote:
>
> > On Tue, Jul 12, 2016 at 10:49:58AM -0700, H. Peter Anvin wrote:
> > > On 07/12/16 08:05, Paul E. McKenney wrote:
> > > The CPU in question (and /proc/cpuinfo should show this) has four cores
> > > with a total of eight threads. The "siblings" and "cpu cores" fields in
> > > /proc/cpuinfo should show the same thing. So I am utterly confused
> > > about what is unexpected here?
> >
> > Typically threads are enumerated differently on Intel parts. Namely:
> >
> >     cpu_id = core_id + nr_cores * smt_id
>
> Yeah, they are 'interleaved' at the thread/core level - I suppose to 'mix' them on
> OS schedulers that don't know about SMT.
>
> (Fortunately this interleaving is not done across NUMA domains.)
>
> >     $ cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
>
> Btw., this command will print out the mappings in order even on larger systems and
> shows the CPU # as well:
>
>   $ grep -i . /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | sort -t u -k +3 -n
>
> /sys/devices/system/cpu/cpu0/topology/thread_siblings_list:0,60
> /sys/devices/system/cpu/cpu1/topology/thread_siblings_list:1,61
> /sys/devices/system/cpu/cpu2/topology/thread_siblings_list:2,62
> /sys/devices/system/cpu/cpu3/topology/thread_siblings_list:3,63
> /sys/devices/system/cpu/cpu4/topology/thread_siblings_list:4,64
> /sys/devices/system/cpu/cpu5/topology/thread_siblings_list:5,65
> /sys/devices/system/cpu/cpu6/topology/thread_siblings_list:6,66
> /sys/devices/system/cpu/cpu7/topology/thread_siblings_list:7,67
> /sys/devices/system/cpu/cpu8/topology/thread_siblings_list:8,68
> /sys/devices/system/cpu/cpu9/topology/thread_siblings_list:9,69
> /sys/devices/system/cpu/cpu10/topology/thread_siblings_list:10,70
> /sys/devices/system/cpu/cpu11/topology/thread_siblings_list:11,71
> ...
> /sys/devices/system/cpu/cpu116/topology/thread_siblings_list:56,116
> /sys/devices/system/cpu/cpu117/topology/thread_siblings_list:57,117
> /sys/devices/system/cpu/cpu118/topology/thread_siblings_list:58,118
> /sys/devices/system/cpu/cpu119/topology/thread_siblings_list:59,119

Here is what that gets me on the x86 test system I usually use:

/sys/devices/system/cpu/cpu0/topology/thread_siblings_list:0,32
/sys/devices/system/cpu/cpu1/topology/thread_siblings_list:1,33
/sys/devices/system/cpu/cpu2/topology/thread_siblings_list:2,34
/sys/devices/system/cpu/cpu3/topology/thread_siblings_list:3,35
/sys/devices/system/cpu/cpu4/topology/thread_siblings_list:4,36
/sys/devices/system/cpu/cpu5/topology/thread_siblings_list:5,37
/sys/devices/system/cpu/cpu6/topology/thread_siblings_list:6,38
/sys/devices/system/cpu/cpu7/topology/thread_siblings_list:7,39
/sys/devices/system/cpu/cpu8/topology/thread_siblings_list:8,40
/sys/devices/system/cpu/cpu9/topology/thread_siblings_list:9,41
/sys/devices/system/cpu/cpu10/topology/thread_siblings_list:10,42
/sys/devices/system/cpu/cpu11/topology/thread_siblings_list:11,43
[ . . . ]
/sys/devices/system/cpu/cpu56/topology/thread_siblings_list:24,56
/sys/devices/system/cpu/cpu57/topology/thread_siblings_list:25,57
/sys/devices/system/cpu/cpu58/topology/thread_siblings_list:26,58
/sys/devices/system/cpu/cpu59/topology/thread_siblings_list:27,59
/sys/devices/system/cpu/cpu60/topology/thread_siblings_list:28,60
/sys/devices/system/cpu/cpu61/topology/thread_siblings_list:29,61
/sys/devices/system/cpu/cpu62/topology/thread_siblings_list:30,62
/sys/devices/system/cpu/cpu63/topology/thread_siblings_list:31,63

On my laptop:

/sys/devices/system/cpu/cpu0/topology/thread_siblings_list:0-1
/sys/devices/system/cpu/cpu1/topology/thread_siblings_list:0-1
/sys/devices/system/cpu/cpu2/topology/thread_siblings_list:2-3
/sys/devices/system/cpu/cpu3/topology/thread_siblings_list:2-3
/sys/devices/system/cpu/cpu4/topology/thread_siblings_list:4-5
/sys/devices/system/cpu/cpu5/topology/thread_siblings_list:4-5
/sys/devices/system/cpu/cpu6/topology/thread_siblings_list:6-7
/sys/devices/system/cpu/cpu7/topology/thread_siblings_list:6-7

> > The ordering Paul has, namely 0,1 for core0,smt{0,1} is not something
> > I've ever seen on an Intel part. AMD otoh does enumerate their CMT stuff
> > like what Paul has.
>
> That's more the natural 'direct' mapping from CPU internal topology to CPU id:
> what's close to each other physically is close to each other in the CPU id space
> as well.

Agreed!

							Thanx, Paul
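
The two enumerations discussed in this thread can be sketched as follows. This is a minimal illustration, not anything taken from the kernel: the function names are hypothetical, nr_cores is assumed to be the machine-wide core count (32 and 60 would match the two server listings above), and two threads per core is assumed by default.

```python
from typing import Iterable


def interleaved_cpu_id(core_id: int, smt_id: int, nr_cores: int) -> int:
    """Formula quoted in the thread: cpu_id = core_id + nr_cores * smt_id."""
    return core_id + nr_cores * smt_id


def adjacent_cpu_id(core_id: int, smt_id: int, threads_per_core: int) -> int:
    """'Direct' mapping (assumed form): sibling threads get consecutive ids."""
    return core_id * threads_per_core + smt_id


def format_cpu_list(ids: Iterable[int]) -> str:
    """Render ids the way the listings do: 'a-b' for consecutive runs, ',' otherwise."""
    ordered = sorted(set(ids))
    parts = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next id is exactly one larger.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        parts.append(str(ordered[i]) if i == j else f"{ordered[i]}-{ordered[j]}")
        i = j + 1
    return ",".join(parts)


def interleaved_siblings(core_id: int, nr_cores: int, threads_per_core: int = 2) -> str:
    """Sibling list for one core under the interleaved enumeration."""
    return format_cpu_list(
        interleaved_cpu_id(core_id, smt, nr_cores) for smt in range(threads_per_core)
    )


def adjacent_siblings(core_id: int, threads_per_core: int = 2) -> str:
    """Sibling list for one core under the adjacent enumeration."""
    return format_cpu_list(
        adjacent_cpu_id(core_id, smt, threads_per_core) for smt in range(threads_per_core)
    )


if __name__ == "__main__":
    print(interleaved_siblings(0, nr_cores=32))   # compare with the cpu0 line of the 64-CPU listing
    print(interleaved_siblings(24, nr_cores=32))  # compare with the cpu56 line of the 64-CPU listing
    print(adjacent_siblings(0))                   # compare with the cpu0 line of the laptop listing
```

Nothing here reads sysfs. It only reproduces the arithmetic, so the results can be compared by eye against the thread_siblings_list output on a given machine.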