From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jan Beulich"
Subject: Re: [PATCH, RFC] x86: make the GDT per-CPU
Date: Thu, 11 Sep 2008 13:28:07 +0100
Message-ID: <48C92AF7.76E4.0078.0@novell.com>
References: <48C7F75C.76E4.0078.0@novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

>>> Keir Fraser 11.09.08 12:54 >>>
>On 10/9/08 15:35, "Jan Beulich" wrote:
>
>> The major issue with supporting a significantly larger number of physical
>> CPUs appears to be the use of per-CPU GDT entries - at present, x86-64
>> could support only up to 126 CPUs (with code changes to also use the
>> top-most GDT page, that would be 254). Instead of trying to go with
>> incremental steps here, by converting the GDT itself to be per-CPU,
>> limitations in that respect go away entirely.
>
>Two thoughts:
>
>Firstly, we don't really need the LDT and TSS GDT slots to be always valid.
>Actually we always initialise the slot immediately before LTR or LLDT. So we
>could even have per-CPU LDT and TSS initialisation share a single slot.
>Then, with the extra reserved page, we'd be good for nearly 512 CPUs.

No, this would break 32-bits at least: The GDT entry for the selector
loaded into TR must remain a valid, busy TSS descriptor for the whole
lifetime of the system. So it can't be shared with the LDT. But even for
64-bits I would fear using the same GDT slot for both LDT and GDT loading.

>Secondly: Actually your patch looks not too bad. But the double LGDT in
>context switch is nasty. But also I do not see why it is necessary?
>Presumably your fear is about using the prev->vcpu_id's mapped GDT in
>next->vcpu_id's page tables? But we should only be relying on GDT entries
>(HYPERVISOR_CS, HYPERVISOR_DS, for example) which are identical in all
>per-CPU GDTs. So why do you need to add that LGDT before CR3 switch at all?

The goal is that the per-CPU descriptor be valid at all times (see the
check_cpu() calls I put in there for debugging). As the double fault
handlers have no way of deriving the current processor other than from
that GDT entry (actually, I think x86-64 could, but didn't so far, so I
didn't change that now), they'd break during that window. While you may
argue that double faults are rare, my point here is that if we ever see
one, analyzing its dump shouldn't be made more difficult than it likely
already will be.

>You would need to use l1e_write_atomic() in the context-switch code, to make
>sure all VCPU's hypervisor reserved GDT mappings are always valid. Actually
>you must at least use l1e_write() in any case -- it is not safe to not use
>one of those macros on a live pagetable (by which I mean possibly in use by
>some CPU) because a direct write of a PAE pte is not atomic and can cause
>the pte to pass through a bogus intermediate state (which could be bogusly
>prefetched by a CPU into its TLB. Yuk!).

Ah, yes. l1e_write() should be sufficient, though, as the slot(s) that
get(s) written cannot be validly in use on any CPU (for other than
speculation).

Jan
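
To illustrate the point above about writes to a live PAE page table, here is
a minimal C sketch. It is not Xen's actual l1e_write() or l1e_write_atomic().
The type pae_pte_t and the helpers pte_value(), pte_write_torn() and
pte_write_ordered() are hypothetical names made up for this example. It
assumes a 64-bit entry stored as two 32-bit words with the present bit in
bit 0 of the low word, and it uses C11 fences as a stand-in for whatever
barriers real hypervisor code would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of a PAE pte as a 32-bit CPU stores it: two words. */
typedef struct {
    volatile uint32_t lo;   /* bit 0 = present, bits 12..31 = low frame bits */
    volatile uint32_t hi;   /* high frame bits */
} pae_pte_t;

static_assert(sizeof(pae_pte_t) == 8, "a PAE pte is 64 bits wide");

#define PTE_PRESENT 0x1u

/* Combine both halves into the 64-bit value a page walk would use. */
static uint64_t pte_value(const pae_pte_t *p)
{
    return ((uint64_t)p->hi << 32) | p->lo;
}

/*
 * Unsafe direct write: two plain stores. Returns the value the entry holds
 * between the stores, i.e. what another CPU could load into its TLB.
 */
static uint64_t pte_write_torn(pae_pte_t *p, uint64_t new_val)
{
    uint64_t seen;

    p->lo = (uint32_t)new_val;          /* new low half, old high half */
    seen = pte_value(p);                /* bogus intermediate state */
    p->hi = (uint32_t)(new_val >> 32);
    return seen;
}

/*
 * Ordered write: make the entry not-present first, then fill in the high
 * half, and set the low half (with the present bit) last. Returns the value
 * the entry holds just before the final store.
 */
static uint64_t pte_write_ordered(pae_pte_t *p, uint64_t new_val)
{
    uint64_t seen;

    p->lo = 0;                                   /* entry not present */
    atomic_thread_fence(memory_order_seq_cst);
    p->hi = (uint32_t)(new_val >> 32);
    atomic_thread_fence(memory_order_seq_cst);
    seen = pte_value(p);                         /* still not present */
    p->lo = (uint32_t)new_val;                   /* becomes valid here */
    atomic_thread_fence(memory_order_seq_cst);
    return seen;
}
```

Starting from an entry with hi = 0x1, lo = 0x1001 and writing hi = 0x2,
lo = 0x5001, pte_write_torn() passes through 0x100005001: a present entry
naming a frame that belongs to neither the old nor the new mapping. With
pte_write_ordered() every intermediate state has the present bit clear, so
at worst a CPU sees a not-present entry. A fully atomic update would need a
single 64-bit store or compare-and-exchange instead.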