From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
    id 1LAjjU-0006Le-Vg
    for qemu-devel@nongnu.org; Thu, 11 Dec 2008 06:29:29 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
    id 1LAjjS-0006KL-63
    for qemu-devel@nongnu.org; Thu, 11 Dec 2008 06:29:28 -0500
Received: from [199.232.76.173] (port=52638 helo=monty-python.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43)
    id 1LAjjR-0006KE-W1
    for qemu-devel@nongnu.org; Thu, 11 Dec 2008 06:29:26 -0500
Received: from outbound-dub.frontbridge.com ([213.199.154.16]:38679 helo=IE1EHSOBE005.bigfish.com)
    by monty-python.gnu.org with esmtps (TLS-1.0:RSA_ARCFOUR_MD5:16)
    (Exim 4.60)
    (envelope-from )
    id 1LAjjR-0006GD-59
    for qemu-devel@nongnu.org; Thu, 11 Dec 2008 06:29:25 -0500
Message-ID: <4940F9B5.9080206@amd.com>
Date: Thu, 11 Dec 2008 12:29:57 +0100
From: Andre Przywara
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------030507010506000508040102"
Subject: [Qemu-devel] [PATCH 2/3] NUMA: promoting NUMA topology to BIOS and pin guest memory
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Anthony Liguori
Cc: qemu-devel@nongnu.org, Avi Kivity

--------------030507010506000508040102
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit

This patch pushes the parsed NUMA topology via the firmware
configuration interface to the BIOS and pins the guest memory (if desired).

Signed-off-by: Andre Przywara

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 277-84917
----to satisfy European Law for business letters:
AMD Saxony Limited Liability Company & Co. KG,
Wilschdorfer Landstr. 101, 01109 Dresden, Germany
Register Court Dresden: HRA 4896, General Partner authorized
to represent: AMD Saxony LLC (Wilmington, Delaware, US)
General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy

--------------030507010506000508040102
Content-Type: text/x-patch; name="qemunuma_hostalloc.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="qemunuma_hostalloc.patch"

# HG changeset patch
# User Andre Przywara
# Date 1228992161 -3600
# Node ID 0501b7490a00ef7a77e69f846d332f797162052a
# Parent 394d02758aa4358be3bcd14f9d59efaf42e89328
promoting NUMA topology to BIOS and pin guest memory

diff -r 394d02758aa4 -r 0501b7490a00 Makefile.target
--- a/Makefile.target Thu Dec 11 11:36:21 2008 +0100
+++ b/Makefile.target Thu Dec 11 11:42:41 2008 +0100
@@ -600,6 +600,10 @@ endif
 endif
 ifdef CONFIG_CS4231A
 SOUND_HW += cs4231a.o
+endif
+
+ifdef CONFIG_NUMA
+LIBS += -lnuma
 endif
 
 ifdef CONFIG_VNC_TLS
diff -r 394d02758aa4 -r 0501b7490a00 configure
--- a/configure Thu Dec 11 11:36:21 2008 +0100
+++ b/configure Thu Dec 11 11:42:41 2008 +0100
@@ -368,6 +368,8 @@ for opt do
   ;;
   --enable-mixemu) mixemu="yes"
   ;;
+  --disable-numa) numa="no"
+  ;;
   --disable-aio) aio="no"
   ;;
   --disable-blobs) blobs="no"
@@ -462,6 +464,7 @@ echo " --audio-card-list=LIST set lis
 echo " --audio-card-list=LIST set list of additional emulated audio cards"
 echo " Available cards: ac97 adlib cs4231a gus"
 echo " --enable-mixemu enable mixer emulation"
+echo " --disable-numa disable NUMA support (host side)"
 echo " --disable-brlapi disable BrlAPI"
 echo " --disable-vnc-tls disable TLS encryption for VNC server"
 echo " --disable-curses disable curses output"
@@ -877,6 +880,21 @@ done
 done
 
 ##########################################
+# libnuma probe
+
+if test -z "$numa" ; then
+  numa=no
+
+  cat > $TMPC << EOF
+#include
+int main(void) { return numa_available(); }
+EOF
+  if $cc ${ARCH_CFLAGS} -o $TMPE ${OS_CFLAGS} $TMPC -lnuma > /dev/null 2> /dev/null ; then
+    numa=yes
+  fi
+fi
+
+##########################################
 # BrlAPI probe
 
 if test -z "$brlapi" ; then
@@ -1033,6 +1051,7 @@ echo "Audio drivers $audio_drv_list"
 echo "Audio drivers $audio_drv_list"
 echo "Extra audio cards $audio_card_list"
 echo "Mixer emulation $mixemu"
+echo "NUMA support $numa"
 echo "VNC TLS support $vnc_tls"
 if test "$vnc_tls" = "yes" ; then
   echo " TLS CFLAGS $vnc_tls_cflags"
@@ -1272,6 +1291,10 @@ if test "$mixemu" = "yes" ; then
 if test "$mixemu" = "yes" ; then
   echo "CONFIG_MIXEMU=yes" >> $config_mak
   echo "#define CONFIG_MIXEMU 1" >> $config_h
+fi
+if test "$numa" = "yes" ; then
+  echo "CONFIG_NUMA=yes" >> $config_mak
+  echo "#define CONFIG_NUMA 1" >> $config_h
 fi
 if test "$vnc_tls" = "yes" ; then
   echo "CONFIG_VNC_TLS=yes" >> $config_mak
diff -r 394d02758aa4 -r 0501b7490a00 hw/fw_cfg.h
--- a/hw/fw_cfg.h Thu Dec 11 11:36:21 2008 +0100
+++ b/hw/fw_cfg.h Thu Dec 11 11:42:41 2008 +0100
@@ -8,6 +8,9 @@
 #define FW_CFG_NOGRAPHIC 0x04
 #define FW_CFG_NB_CPUS 0x05
 #define FW_CFG_MACHINE_ID 0x06
+#define FW_CFG_NUMA_NODES 0x07
+#define FW_CFG_NUMA_NODE_CPUS 0x08
+#define FW_CFG_NUMA_NODE_MEM 0x09
 #define FW_CFG_MAX_ENTRY 0x10
 
 #define FW_CFG_WRITE_CHANNEL 0x4000
diff -r 394d02758aa4 -r 0501b7490a00 hw/pc.c
--- a/hw/pc.c Thu Dec 11 11:36:21 2008 +0100
+++ b/hw/pc.c Thu Dec 11 11:42:41 2008 +0100
@@ -436,6 +436,12 @@ static void bochs_bios_init(void)
     fw_cfg = fw_cfg_init(BIOS_CFG_IOPORT, BIOS_CFG_IOPORT + 1, 0, 0);
     fw_cfg_add_i32(fw_cfg, FW_CFG_ID, 1);
     fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
+    fw_cfg_add_i16(fw_cfg, FW_CFG_NUMA_NODES, numnumanodes);
+
+    fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA_NODE_MEM, (uint8_t*)node_mem,
+        sizeof(node_mem[0]) * numnumanodes);
+    fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA_NODE_CPUS, (uint8_t*)node_to_cpus,
+        sizeof(node_to_cpus[0]) * numnumanodes);
 }
 
 /* Generate an initial boot sector which sets state and jump to
diff -r 394d02758aa4 -r 0501b7490a00 vl.c
--- a/vl.c Thu Dec 11 11:36:21 2008 +0100
+++ b/vl.c Thu Dec 11 11:42:41 2008 +0100
@@ -93,6 +93,9 @@
 
 #include
 #include
+#ifdef CONFIG_NUMA
+#include
+#endif
 #endif
 #ifdef __sun__
 #include
@@ -5449,6 +5452,21 @@ int main(int argc, char **argv, char **e
         exit(1);
     }
 
+#ifdef CONFIG_NUMA
+    if (numnumanodes > 0 && numa_available() != -1) {
+        unsigned long offset = 0;
+        int i;
+
+        for (i = 0; i < numnumanodes; ++i) {
+            if (hostnodes[i] != (uint64_t)-1) {
+                numa_tonode_memory (phys_ram_base + offset, node_mem[i],
+                    hostnodes[i] % (numa_max_node() + 1));
+            }
+            offset += node_mem[i];
+        }
+    }
+#endif
+
     /* init the dynamic translator */
     cpu_exec_init_all(tb_size * 1024 * 1024);
 
--------------030507010506000508040102--
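
For readers following the vl.c hunk: the pinning loop walks the guest nodes in
order, keeps a running byte offset into guest RAM, and wraps the requested host
node around the number of host nodes actually present. The sketch below
reproduces only that arithmetic, without calling libnuma, so the resulting
ranges can be inspected on any machine. The names plan_pinning, struct
pin_range and NO_HOST_NODE are hypothetical and do not appear in the patch;
max_host_node stands in for the value numa_max_node() would return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel meaning "do not pin this guest node"; the patch compares
 * hostnodes[i] against (uint64_t)-1. The macro name is hypothetical. */
#define NO_HOST_NODE ((uint64_t)-1)

/* Hypothetical record: one planned binding of a guest RAM range. */
struct pin_range {
    uint64_t offset;    /* byte offset from the start of guest RAM */
    uint64_t length;    /* size of the guest node's memory in bytes */
    int      host_node; /* host node after wrap-around */
};

/*
 * Compute the ranges the patch's loop would hand to numa_tonode_memory().
 * max_host_node is an assumption standing in for numa_max_node().
 * 'out' must have room for nb_nodes entries.
 * Returns the number of ranges written to 'out'.
 */
static size_t plan_pinning(const uint64_t *node_mem,
                           const uint64_t *hostnodes,
                           size_t nb_nodes, int max_host_node,
                           struct pin_range *out)
{
    size_t n = 0;
    uint64_t offset = 0;

    assert(max_host_node >= 0);

    for (size_t i = 0; i < nb_nodes; ++i) {
        if (hostnodes[i] != NO_HOST_NODE) {
            out[n].offset = offset;
            out[n].length = node_mem[i];
            /* Same wrap-around as hostnodes[i] % (numa_max_node() + 1). */
            out[n].host_node =
                (int)(hostnodes[i] % (uint64_t)(max_host_node + 1));
            ++n;
        }
        /* The offset advances even for nodes that are left unpinned. */
        offset += node_mem[i];
    }
    return n;
}
```

With three guest nodes of 512, 256 and 256 bytes, requested host nodes 5,
"unpinned" and 0, and a host whose highest node number is 1, this yields two
ranges: offset 0 on host node 1 (5 wraps to 1) and offset 768 on host node 0.
The middle node is skipped but still advances the offset.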