xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Joao Martins <joao.m.martins@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Wei Liu <wei.liu2@citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	xen-devel@lists.xen.org, Keir Fraser <keir@xen.org>
Subject: Re: [PATCH RFC 1/8] x86/hvm: set initial apicid to vcpu_id
Date: Wed, 2 Mar 2016 18:49:20 +0000	[thread overview]
Message-ID: <56D735B0.8040402@oracle.com> (raw)
In-Reply-To: <56CF41F302000078000D6595@prv-mh.provo.novell.com>

On 02/25/2016 05:03 PM, Jan Beulich wrote:
>>>> On 22.02.16 at 22:02, <joao.m.martins@oracle.com> wrote:
>> Currently the initial_apicid is set vcpu_id * 2 which makes it difficult
>> for the toolstack to manage how is the topology seen by the guest.
>> Instead of forcing procpkg and proccount to be VCPUID * 2, instead we
>> set it to max vcpuid on proccount to max_vcpu_id + 1 (logical number of
>> logical cores) and procpkg to max_vcpu_id (max cores minus 1)
> 
> I'm afraid it takes more than this to explain why the change is
> needed or at least desirable.
Apologies for my clumsiness in the commit message. I should have explained
properly why we need this for this series in the first place.

Currently use initial_apicid as vcpu_id * 2,
and doubled the leafs 1 and 4 values (proccount and procpkg) which means we will
address 8 LAPICIDs (tohugh only 4 will be used). Example topology and algorithm
below to facilitate discussion:

# Maximum logical addressable IDs (logical processors in a package)
proccount = CPUID.1:EBX[23:16]

# Maximum core addressable IDs - 1 (maximum cores in a package - 1)
procpkg = CPUID.(4,0):EAX[31:26]

# MSB (Calculate most significant bit)
SMT_Mask_width = MSB(proccount / (procpkg + 1))
Core_Mask_width = MSB(procpkg + 1)
CoreSMT_Mask_width = SMT_Mask_width + Core_Mask_width
Pkg_Mask_width = 1 << CoreSMT_Mask_width

SMT_ID = APICID & ((1 << SMT_Mask_width) - 1)
Core_ID = (APICID >> SMT_Mask_width) & ((1 << Core_Mask_width) - 1)
Pkg_ID = (APICID & Pkg_Mask_width) >> CoreSMT_Mask_width

So as it is right now, the topology on a 4 vcpu HVM guest looks like:

=> topology(proccount = 16, procpkg = 7) # current
APICID=0 SMT_ID=0 CORE_ID=0 PKG_ID=0 # VCPU 0
APICID=1 SMT_ID=1 CORE_ID=0 PKG_ID=0
APICID=2 SMT_ID=0 CORE_ID=1 PKG_ID=0 # VCPU 1
APICID=3 SMT_ID=1 CORE_ID=1 PKG_ID=0
APICID=4 SMT_ID=0 CORE_ID=2 PKG_ID=0 # VCPU 2
APICID=5 SMT_ID=1 CORE_ID=2 PKG_ID=0
APICID=6 SMT_ID=0 CORE_ID=3 PKG_ID=0 # VCPU 3
APICID=7 SMT_ID=1 CORE_ID=3 PKG_ID=0
[...]
APICID=14 SMT_ID=0 CORE_ID=7 PKG_ID=0
APICID=15 SMT_ID=1 CORE_ID=7 PKG_ID=0

As you know, APICID describes the SMT, Core and PKG. Problem with having APICID
in even numbers (0, 2, 4, ... N) is that we can't describe the SMT/siblings in
the topology. Thus turning the APICID ID space into (0, 1, 2 .. N) like this
patch proposes means we can know calculate all possibilities on both
topology kinds. Note that is a prerequisite patch so that a later
patch in this series sets the proccount and procpkg to enable seeing some cores
as SMT siblings.

=> topology(proccount = 4, procpkg = 3) # with this patch
APICID=0 SMT_ID=0 CORE_ID=0 PKG_ID=0
APICID=1 SMT_ID=0 CORE_ID=1 PKG_ID=0
APICID=2 SMT_ID=0 CORE_ID=2 PKG_ID=0
APICID=3 SMT_ID=0 CORE_ID=3 PKG_ID=0

x2APIC isn't addressed here for this RFC but it has the same issue (and
consequently exposure of FEATURE_XTOPOLOGY, CPUID.(EAX=0xB, ECX=N)). One
difference is that the SMT,Core,Pkg mask widths are fetched from each subleaf
directly as opposed to a calculation between procpkg and proccount.

> In particular I'd like to suggest that
> you do some archeology to understand why things are the way
> they are.
Digging in the history and threads, this behavior seems to be introduced by
commit c21d85b ("[HVM] Change VCPU->LAPIC_ID mapping so that VCPU0 has ID0")
where the main issue looked like a conflict between VCPU 0 LAPICID and IOAPIC
ID. Previous commits (a41ba62, facdf41) made IOAPIC on 0 and vLAPIC on 1..N but
it broke on old kernels (for the lack of LAPIC 0), so it ended up having a
vLAPIC ID space with 0, 2, 4, 6 and assign vIOAPICID = 1. This way all of
{L,IO}APICs have unique IDs - this thread
(http://lists.xen.org/archives/html/xen-devel/2008-09/msg00986.html) seems to
mention something along these lines too.

But the manuals aren't exactly clear on this ID uniqueness between LAPICs and
I/O APICs on more recent processors. Any lights on this would be great.

Intel 82093AA (IOAPIC) datasheet [0] says the following:

"This register contains the 4-bit APIC ID. The ID serves as a physical name of
the IOAPIC. All APIC devices using the APIC bus should have a unique APIC ID."

Though looking at the Intel SDM Volume 3, Chapter 10.1, Figure 10-2 and 10-3,
the APIC bus seems to be used only up to P6 family processors (Figure 10-2)
and it's indeed shared between I/OAPIC and LAPIC . For its successor (Pentium 4
and later) it's no longer the case (Figure 10-3).

My Broadwell machine in fact have conflicting APIC IDs between IOAPIC and LAPIC
in my MADT table. And it does seem that it's the case too for SeaBIOS (commit
e39b938 ("report real I/O APIC ID (0) on MADT and MP-table (v3)") ) and QEMU.
Though it wouldn't justify as reason for doing this on Xen.

[0] http://www.intel.com/design/chipsets/datashts/29056601.pdf

>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -4633,7 +4633,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, 
>> unsigned int *ebx,
>>      case 0x1:
>>          /* Fix up VLAPIC details. */
>>          *ebx &= 0x00FFFFFFu;
>> -        *ebx |= (v->vcpu_id * 2) << 24;
>> +        *ebx |= (v->vcpu_id) << 24;
>>          if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
>>              __clear_bit(X86_FEATURE_APIC & 31, edx);
> 
> In no case is this sufficient as adjustment to the hypervisor side,
> as it gets things out of sync with e.g. hvm/vlapic.c.
Yes, and also the firmware like this chunk below (tested too):

diff --git a/tools/firmware/hvmloader/config.h b/tools/firmware/hvmloader/config.h
index b838cf9..9c1f2f7 100644
--- a/tools/firmware/hvmloader/config.h
+++ b/tools/firmware/hvmloader/config.h
@@ -47,7 +47,7 @@ extern struct bios_config ovmf_config;
 #define IOAPIC_VERSION      0x11

 #define LAPIC_BASE_ADDRESS  0xfee00000
-#define LAPIC_ID(vcpu_id)   ((vcpu_id) * 2)
+#define LAPIC_ID(vcpu_id)   (vcpu_id)

 #define PCI_ISA_DEVFN       0x08    /* dev 1, fn 0 */
 #define PCI_ISA_IRQ_MASK    0x0c20U /* ISA IRQs 5,10,11 are PCI connected */
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 01a8430..80fc6a1 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1206,7 +1206,7 @@ void vlapic_reset(struct vlapic *vlapic)
     if ( !has_vlapic(v->domain) )
         return;

-    vlapic_set_reg(vlapic, APIC_ID,  (v->vcpu_id * 2) << 24);
+    vlapic_set_reg(vlapic, APIC_ID,  v->vcpu_id << 24);
     vlapic_set_reg(vlapic, APIC_LVR, VLAPIC_VERSION);

     for ( i = 0; i < 8; i++ )

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2016-03-02 18:49 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-22 21:02 [PATCH RFC 0/8] x86/hvm, libxl: HVM SMT topology support Joao Martins
2016-02-22 21:02 ` [PATCH RFC 1/8] x86/hvm: set initial apicid to vcpu_id Joao Martins
2016-02-25 17:03   ` Jan Beulich
2016-03-02 18:49     ` Joao Martins [this message]
2016-02-22 21:02 ` [PATCH RFC 2/8] libxl: remove whitespace on libxl_types.idl Joao Martins
2016-02-25 16:28   ` Wei Liu
2016-03-02 19:14     ` Joao Martins
2016-02-22 21:02 ` [PATCH RFC 3/8] libxl: cpuid: add cache core count support Joao Martins
2016-02-22 21:02 ` [PATCH RFC 4/8] libxl: cpuid: add guest topology support Joao Martins
2016-02-25 16:29   ` Wei Liu
2016-03-02 19:14     ` Joao Martins
2016-02-22 21:02 ` [PATCH RFC 5/8] libxl: introduce smt field Joao Martins
2016-02-25 16:29   ` Wei Liu
2016-02-22 21:02 ` [PATCH RFC 6/8] xl: introduce smt option Joao Martins
2016-02-22 21:02 ` [PATCH RFC 7/8] libxl: introduce topology fields Joao Martins
2016-02-25 16:29   ` Wei Liu
2016-03-02 19:16     ` Joao Martins
2016-02-22 21:02 ` [PATCH RFC 8/8] xl: introduce topology options Joao Martins
2016-02-25 17:21 ` [PATCH RFC 0/8] x86/hvm, libxl: HVM SMT topology support Andrew Cooper
2016-02-26 15:03   ` Dario Faggioli
2016-02-26 15:27     ` Konrad Rzeszutek Wilk
2016-02-26 15:42       ` Dario Faggioli
2016-02-26 15:48         ` Andrew Cooper
2016-03-02 19:18   ` Joao Martins
2016-03-02 20:03     ` Andrew Cooper
2016-03-03  9:52       ` Joao Martins
2016-03-03 10:24         ` Andrew Cooper
2016-03-03 12:23           ` Joao Martins
2016-03-03 12:48             ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56D735B0.8040402@oracle.com \
    --to=joao.m.martins@oracle.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=keir@xen.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).