From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.linutronix.de (146.0.238.70:993) by crypto-ml.lab.linutronix.de with IMAP4-SSL for ; 28 Jun 2018 22:19:08 -0000
Received: from smtp.ctxuk.citrix.com ([185.25.65.24] helo=SMTP.EU.CITRIX.COM) by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80) (envelope-from ) id 1fYfFq-0006j1-Np for speck@linutronix.de; Fri, 29 Jun 2018 00:19:07 +0200
Subject: [MODERATED] Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
References: <20180620201907.304694346@linutronix.de> <20180620201933.588840902@linutronix.de> <9a4cea33-05a7-b6d9-7f49-692603bd047f@linux.intel.com>
From: Andrew Cooper
Message-ID: <1bca3675-86eb-15df-bcfe-2ee36167f15c@citrix.com>
Date: Thu, 28 Jun 2018 23:19:00 +0100
MIME-Version: 1.0
In-Reply-To: <9a4cea33-05a7-b6d9-7f49-692603bd047f@linux.intel.com>
Content-Type: multipart/mixed; boundary="GsBtuMXPA7SPQqTUcRxNZG77nkSaNP1Z4"; protected-headers="v1"
To: speck@linutronix.de
List-ID:

--GsBtuMXPA7SPQqTUcRxNZG77nkSaNP1Z4
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Content-Language: en-GB

On 28/06/2018 23:13, speck for Dave Hansen wrote:
> On 06/20/2018 01:19 PM, speck for Thomas Gleixner wrote:
>> +	/*
>> +	 * If SMT is force disabled and the APIC ID belongs to
>> +	 * a secondary thread, ignore it.
>> +	 */
>> +	if (apic_id_disabled(apicid)) {
>> +		pr_info_once("Ignoring secondary SMT threads\n");
>> +		return -EINVAL;
>> +	}
> Thomas, this boottime-disable stuff just ends up ignoring the
> hyperthread and leaves it alone, right?

Yes

>
> Some Intel folks pointed out a few problems with this. One is with
> machine checks. If one thread is booted and has CR4.MCE=1, but the
> other never gets booted and still has CR4.MCE=0, things go boom
> (everything goes to shutdown state) if a machine check happens.
>
> We've traditionally pretended that this does not happen because folks
> don't tend to turn off CPUs they've paid for via things like maxcpus= in
> the real world.
>
> Ashok Raj and Tony Luck were evidently looking at this at some point,
> but it got tricky and decided it wasn't worth the trouble.
>
> It makes me think we should either scrap or recommend against "nosmt=force".
>
> Some relevant SDM language:
>
>> Because the logical processors within a physical package are tightly
>> coupled with respect to shared hardware resources, both logical
>> processors are notified of machine check errors that occur within a
>> given physical processor. If machine-check exceptions are enabled
>> when a fatal error is reported, all the logical processors within a
>> physical package are dispatched to the machine-check exception
>> handler. If machine-check exceptions are disabled, the logical
>> processors enter the shutdown state and assert the IERR# signal. When
>> enabling machine-check exceptions, the MCE flag in control register
>> CR4 should be set for each logical processor.

So what you're saying is that we need to boot all the threads, including
MCE setup etc, then leave them alone (mwait/deep C states?) so they
avoid causing a shutdown?

If so, I've got quite a lot of extra work to do in Xen...

~Andrew

--GsBtuMXPA7SPQqTUcRxNZG77nkSaNP1Z4--