From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42683) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VdIfi-0004Lu-Te for qemu-devel@nongnu.org; Mon, 04 Nov 2013 06:50:25 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VdIfc-0001As-8j for qemu-devel@nongnu.org; Mon, 04 Nov 2013 06:50:18 -0500 Received: from e06smtp11.uk.ibm.com ([195.75.94.107]:48862) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VdIfc-0001AY-00 for qemu-devel@nongnu.org; Mon, 04 Nov 2013 06:50:12 -0500 Received: from /spool/local by e06smtp11.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 4 Nov 2013 11:50:10 -0000 Date: Mon, 4 Nov 2013 12:50:04 +0100 From: Thomas Huth Message-ID: <20131104125004.666e84b7@oc7435384737.ibm.com> In-Reply-To: <777B6924-952E-4E09-B36C-9C36FC35252C@suse.de> References: <1383301276-29566-1-git-send-email-aik@ozlabs.ru> <00DBC3F8-D221-4E59-B0D0-38E1A4C015CC@suse.de> <1383562538.4776.52.camel@pasglop> <777B6924-952E-4E09-B36C-9C36FC35252C@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alexander Graf Cc: Alexey Kardashevskiy , "list@suse.de:PReP" , QEMU Developers , Nikunj A Dadhania On Mon, 4 Nov 2013 12:28:12 +0100 Alexander Graf wrote: > > On 04.11.2013, at 11:55, Benjamin Herrenschmidt wrote: > > > On Mon, 2013-11-04 at 11:44 +0100, Alexander Graf wrote: > >> On 01.11.2013, at 11:21, Alexey Kardashevskiy wrote: > >> > >>> SLOF gets really confused if RTAS/device-tree and everything else > >>> what SLOF can use is not in the very first block of the very first > >>> memory node. > >>> > >>> This makes sure that the RMA area is where SLOF expects it to be. > >>> > >>> Cc: Benjamin Herrenschmidt > >>> Cc: Nikunj A Dadhania > >>> Signed-off-by: Alexey Kardashevskiy > >>> --- > >>> hw/ppc/spapr.c | 8 +++++++- > >>> 1 file changed, 7 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > >>> index 09dc635..09a5d94 100644 > >>> --- a/hw/ppc/spapr.c > >>> +++ b/hw/ppc/spapr.c > >>> @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args) > >>> int i; > >>> MemoryRegion *sysmem = get_system_memory(); > >>> MemoryRegion *ram = g_new(MemoryRegion, 1); > >>> - hwaddr rma_alloc_size; > >>> + hwaddr rma_alloc_size, node0_size; > >>> uint32_t initrd_base = 0; > >>> long kernel_size = 0, initrd_size = 0; > >>> long load_limit, rtas_limit, fw_size; > >>> @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args) > >>> spapr->rma_size = MIN(spapr->rma_size, 0x10000000); > >>> } > >>> } > >>> + /* > >>> + * SLOF gets confused if RMA resides not in the first block > >>> + * of the first memory node so let's fix it. > >>> + */ > >>> + node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size; > >>> + spapr->rma_size = MIN(spapr->rma_size, node0_size); > >> So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty broken, especially on 970. > >> > >> Why does SLOF have any issues with NUMA memory nodes? It can just ignore them, no? > > > > Because the only way SLOF knows about the RMA is by using the first > > "reg" entry of the first memory node and that's *all* SLOF knows about. > > > > If we start putting things like the DT, SLOF itself, etc... outside of > > that region, it will crash. Ok, the question is whether this is a bug in SLOF and should be fixed there or whether the RMA should really be limited to the RAM of the first node only. Looking at the function spapr_populate_memory(), it seems there is already similar code there, so I assume the RMA should really be limited to that size: /* memory node(s) */ node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size; if (spapr->rma_size > node0_size) { spapr->rma_size = node0_size; } Maybe this piece of code could just be done earlier instead, before setting up the fdt_addr and rtas_addr variables, instead of adding the similar piece of code of this patch? > > So we "constrain" things to the rma that way. > > > > Creating 4M nodes makes no sense anyway > > So why don't we just use the "limit VRMA to 256MB" code always and error out of node0 is smaller? I don't think SLOF can run with less than 256MB anyway. It's 128 MB nowadays ... there is even a define called MIN_RMA_SLOF for this in the code already. Thomas