From: Steve Freitas
Subject: Re: Regression, host crash with 4.5rc1
Date: Fri, 28 Nov 2014 00:24:56 -0800
Message-ID: <54783158.3050601@ihonk.com>
In-Reply-To: <5476FC9C020000780004B11D@mail.emea.novell.com>
To: Jan Beulich, Donald D Dugger, Jun Nakajima
Cc: Don Slutz, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 11/27/2014 01:27 AM, Jan Beulich wrote:
> This was precisely the reason why I told you that the numbering
> differs (and is confusing and has nothing to do with actual C state
> numbers): What max_cstate refers to in the mwait-idle driver is
> what above is listed as type[Cx], i.e. the state at index 1 is C1, at
> 2 we've got C1E, and at 3 we've got C2. And those still aren't in
> line with the numbering the CPU documentation uses, it's rather
> kind of meant to refer to the ACPI numbering (but probably also
> not fully matching up).

Ah, thanks for the explanation.

> So max_cstate=2 working suggests a problem with what the CPU
> calls C6, which presumably isn't all that surprising considering the
> many errata (BD35, BD38, BD40, BD59, BD87, and BD104). Not
> sure how to proceed from here - I suppose you already made
> sure you run with the latest available BIOS.

Yes, latest available BIOS.

> And with 6 errata
> documented it's not all that unlikely that there's a 7th one with
> MONITOR/MWAIT behavior. The commit you bisected to (and
> which you had verified to be the culprit by just forcing
> arch_skip_send_event_check() to always return false) could be
> reasonably assumed to be broken only when MWAIT use for all
> C states didn't work.

Now I did get a hang with max_cstate=3 and mwait-idle=0. May I assume
that mwait-idle=0 means that ACPI is responsible for the throttling?

Thanks again for all your help!

Steve