From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christoph Egger Subject: Re: [patch 4/4]Enable CMCI (Corrected Machine Check Error Interrupt) for Intel CPUs Date: Tue, 13 Jan 2009 12:21:22 +0100 Message-ID: <200901131221.23669.Christoph.Egger@amd.com> References: <200901121503.17232.Christoph.Egger@amd.com> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: "Ke, Liping" Cc: "xen-devel@lists.xensource.com" , Keir Fraser , "Jiang, Yunhong" List-Id: xen-devel@lists.xenproject.org On Tuesday 13 January 2009 03:12:51 Ke, Liping wrote: > Hi, Christoph > Please see our comments below > > Christoph Egger wrote: > > On Friday 19 December 2008 07:19:16 Ke, Liping wrote: > >> Hi, all > >> > >> Patch 4 is the main patch for CMCI enabling in XEN. It adds the CMCI > >> interrupt handler, new common CMCI/MCA init process,CMCI owner judge > >> algorithm when bring_up CPUs, CPU on/offline and polling > >> mechanisms, etc > >> > >> Thanks & Regards, > >> Criping > > > > This patch changes the public API. There's no need to change > > struct mcinfo_extended. The marshalling technique allows to use > > this structure as often as needed. Please undo this change. > > Since Intel extended msrs' number is bigger than AMD, and we can't use > pointer and allocate memory in mca handler, so we extended it from 5 -> 1= 0. > We think it should not impact any of your usage. > > Else, we can change boundary 5->0, use a globally allocated pointer > instead. But it seems not that necessary. How do you think about? When I defined the API, I already knew that 5 is not enough for Intel. So I made struct mc_info large enough to keep multiple entities in any combination. I expected from you to fill struct mcinfo_extended two or three times into struct mc_info the same way as you do with struct mcinfo_bank. There's no (additional) allocation needed. =46rom your description I just read, that you don't know this marshalling=20 technique. > > struct mcinfo_global is also changed. Why can't mc_coreid not be > > filled with the apicid ? Adding the apicid field breaks the alignment > > causing troubles on 32bit-guest-on-64bit-hypervisor. > > We plan to extend the apic_id to 32 bit since 8 bit is not enough for new > apic_id. Besides, for this problem, before sending the patch, we actually > talked about it. Seems original structure is not 32/64 alligned. How about > below changes? > > struct mcinfo_global { > struct mcinfo_common common; 4 byte > > /* running domain at the time in error (most likely the impacted one) > */ uint16_t mc_domid; 2 byte > uint32_t mc_socketid; /* physical socket of the physical core */ 4 by= te > uint16_t mc_coreid; /* physical impacted core */ 2 byte > uint8_t mc_apicid; ---change it to 4 byte > uint16_t mc_core_threadid; /* core thread of physical core */ 2 byte > uint16_t mc_vcpuid; /* virtual cpu scheduled for mc_domid */ 2 byte > uint64_t mc_gstatus; /* global status */ 8 byte > uint32_t mc_flags; 4 byte > }; > > > Change to > ------------------------>>>>> > > struct mcinfo_global { > struct mcinfo_common common; 4 byte > > /* running domain at the time in error (most likely the impacted one) > */ uint32_t mc_socketid; /* physical socket of the physical core */ 4 byte > ----------------------------------------------------------------- > uint16_t mc_domid; 2 byte > uint16_t mc_coreid; /* physical impacted core */ 2 byte > uint32_t mc_apicid; ---change it to 4 byte > ------------------------------------------------------------------- > uint16_t mc_core_threadid; /* core thread of physical core */ 2 byte > uint16_t mc_vcpuid; /* virtual cpu scheduled for mc_domid */ 2 byte > uint32_t mc_flags; 4 byte > -------------------------------------------------------------------------- > uint64_t mc_gstatus; /* global status */ 8 byte > }; Your proposed change fixes the alignment problem, but it doesn't answer my question: Why is mc_apicid needed at all since the CPU core is already identified by mc_coreid ? Christoph =2D-=20 Advanced Micro Devices GmbH Karl-Hammerschmidt-Str. 34 85609 Dornach b. M=FCnchen =20 Gesch=E4ftsf=FChrer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni Sitz: Dornach, Gemeinde Aschheim, Landkreis M=FCnchen Registergericht M=FCnchen, HRB Nr. 43632