From: Cédric Le Goater
To: David Gibson
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, Benjamin Herrenschmidt
Subject: Re: [Qemu-devel] [PATCH v5 36/36] ppc/pnv: add XIVE support
Date: Thu, 6 Dec 2018 16:14:48 +0100
Message-ID: <1fa0e7d4-399d-02b3-1d06-118a6e54d959@kaod.org>
In-Reply-To: <20181203022653.GS30479@umbus.fritz.box>
References: <20181116105729.23240-1-clg@kaod.org> <20181116105729.23240-37-clg@kaod.org> <20181203022653.GS30479@umbus.fritz.box>

On 12/3/18 3:26 AM, David Gibson wrote:
> On Fri, Nov 16, 2018 at 11:57:29AM +0100, Cédric Le Goater wrote:
>> This is a simple model of the POWER9 XIVE interrupt controller for the
>> PowerNV machine. XIVE for baremetal is a complex controller and the
>> model only addresses the needs of the skiboot firmware.
>>
>> * Overall architecture
>>
>>                 XIVE Interrupt Controller
>>              +-------------------------------------+      IPIs
>>              | +---------+ +---------+ +---------+ |    +--------+
>>              | |VC       | |CQ       | |PC       |----> | CORES  |
>>              | |     esb | |         | |         |----> |        |
>>              | |     eas | |  Bridge | |         |----> |        |
>>              | |SC   end | |         | |     nvt | |    |        |
>> +------+     | +---------+ +----+----+ +---------+ |    +--+-+-+-+
>> | RAM  |     +------------------|------------------+       | | |
>> |      |                        |                           | | |
>> |      |                        |                           | | |
>> |      |  +---------------------v-----------------------+---v-v-v--+   other
>> |      <--+                  Power Bus                             +-->  chips
>> |  esb |  +-----------+-----------------------+-------------------+
>> |  eas |              |                       |
>> |  end |              |                       |
>> |  nvt |          +---+----+             +---+----+
>> +------+          |SC      |             |SC      |
>>                   |        |             |        |
>>                   | 2-bits |             | 2-bits |
>>                   | local  |             |   VC   |
>>                   +--------+             +--------+
>>                      PCIe                NX,NPU,CAPI
>>
>> SC: Source Controller (aka. IVSE)
>> VC: Virtualization Controller (aka. IVRE)
>> CQ: Common Queue (Bridge)
>> PC: Presentation Controller (aka. IVPE)
>>
>> 2-bits: source state machine
>> esb: Event State Buffer (Array of PQ bits in an IVSE)
>> eas: Event Assignment Structure
>> end: Event Notification Descriptor
>> nvt: Notification Virtual Target
>>
>> It is composed of three sub-engines:
>>
>> - Interrupt Virtualization Source Engine (IVSE), or Source
>>   Controller (SC). These are found in PCI PHBs, in the PSI host
>>   bridge controller, but also inside the main controller for the
>>   core IPIs and other sub-chips (NX, CAP, NPU) of the
>>   chip/processor. They are configured to feed the IVRE with events.
>>
>> - Interrupt Virtualization Routing Engine (IVRE) or Virtualization
>>   Controller (VC). Its job is to match an event source with an Event
>>   Notification Descriptor (END).
>>
>> - Interrupt Virtualization Presentation Engine (IVPE) or Presentation
>>   Controller (PC). It maintains the interrupt context state of each
>>   thread and handles the delivery of the external exception to the
>>   thread.
>>
>> * XIVE internal tables
>>
>> Each of the sub-engines uses a set of tables to redirect exceptions
>> from event sources to CPU threads.
>>
>>                                            +-------+
>>   User or OS                               |  EQ   |
>>       or                          +------> |entries|
>>   Hypervisor                      |        |  ..   |
>>     Memory                        |        +-------+
>>                                   |            ^
>>                                   |            |
>>          +---------------------------------------------------+
>>                                   |            |
>>   Hypervisor    +------+      +---+--+     +---+--+    +------+
>>     Memory      | ESB  |      | EAT  |     | ENDT |    | NVTT |
>>    (skiboot)    +----+-+      +----+-+     +----+-+    +------+
>>                   ^  |          ^  |         ^  |         ^
>>                   |  |          |  |         |  |         |
>>          +---------------------------------------------------+
>>                   |  |          |  |         |  |         |
>>                   |  |          |  |         |  |         |
>>              +----|--|----------|--|---------|--|-+   +-|-----+   +------+
>>              |    |  |          |  |         |  | |   | | tctx|   |Thread|
>>  IPI or -----+    +  v          +  v         +  v |---|  + .. |----->    |
>>  HW events   |                                    |   |       |   |      |
>>              |                IVRE                |   | IVPE  |   +------+
>>              +------------------------------------+   +-------+
>>
>> The IVSE has a 2-bit state machine for each source, P for pending and
>> Q for queued, which controls whether events are let through. These
>> bits are stored in an array, the Event State Buffer (ESB), and are
>> controlled by MMIOs.
>>
>> If the event is let through, the IVRE looks up the Event Assignment
>> Structure (EAS) table to find the Event Notification Descriptor (END)
>> configured for the source. Each Event Notification Descriptor defines
>> a notification path to a CPU and an in-memory Event Queue, into which
>> event data is pushed for the OS to pull.
>>
>> The IVPE determines if a Notification Virtual Target (NVT) can handle
>> the event by scanning the thread contexts of the VPs dispatched on the
>> processor HW threads. It maintains the interrupt context state of each
>> thread in a NVT table.
>>
>> * QEMU model for PowerNV
>>
>> The PowerNV model reuses the common XIVE framework developed for sPAPR
>> and the fundamental aspects are quite the same. The differences are
>> outlined below.
>>
>> The controller's initial BAR configuration is performed using the
>> XSCOM bus; from there, MMIOs are used for further configuration.
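The 2-bit PQ source state machine described above can be sketched as follows. This is a simplified illustration with invented names (`source_trigger`, `source_eoi`), not the actual QEMU code: a trigger on an idle source latches P and notifies, a trigger on a pending source only latches Q, and the EOI clears P and resends if Q was set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative 2-bit source state: P (pending) and Q (queued). */
enum { PQ_P = 0x2, PQ_Q = 0x1 };

/*
 * Trigger an event on a source. Returns true when a notification
 * should be forwarded to the routing engine (IVRE), false when the
 * event is only recorded in the Q bit for a later resend.
 */
static bool source_trigger(uint8_t *pq)
{
    if (!(*pq & PQ_P)) {        /* idle: latch P and notify */
        *pq |= PQ_P;
        return true;
    }
    *pq |= PQ_Q;                /* already pending: queue it */
    return false;
}

/* EOI: clear P; if Q was set, clear it and re-trigger the event. */
static bool source_eoi(uint8_t *pq)
{
    bool resend = (*pq & PQ_Q) != 0;
    *pq = 0;
    if (resend) {
        return source_trigger(pq);
    }
    return false;
}
```

With this sketch, a trigger/trigger/EOI sequence goes 00 -> 10 -> 11, and the EOI resends the queued event (back to 10) before a final EOI returns the source to 00.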
>>
>> The MMIO regions exposed are:
>>
>> - Interrupt controller registers
>> - ESB pages for IPIs and ENDs
>> - Presenter MMIO (Not used)
>> - Thread Interrupt Management Area MMIO, direct and indirect
>>
>> The Virtualization Controller MMIO region containing the IPI ESB pages
>> and END ESB pages is sub-divided into "sets" which map portions of the
>> VC region to the different ESB pages. It is configured at runtime
>> through the EDT set translation table to let the firmware decide how
>> to split the address space between IPI ESB pages and END ESB pages.
>>
>> The XIVE tables are now in the machine RAM and not in the hypervisor
>> anymore. The firmware (skiboot) configures these tables using Virtual
>> Structure Descriptors defining the characteristics of each table: SBE,
>> EAS, END and NVT. These are later used to access the virtual interrupt
>> entries. The internal cache of these tables in the interrupt controller
>> is updated and invalidated using a set of registers.
>>
>> Signed-off-by: Cédric Le Goater
>> ---
>>  hw/intc/pnv_xive_regs.h    |  314 +++++++
>>  include/hw/ppc/pnv.h       |   22 +-
>>  include/hw/ppc/pnv_xive.h  |  100 +++
>>  include/hw/ppc/pnv_xscom.h |    3 +
>>  include/hw/ppc/xive.h      |    1 +
>>  hw/intc/pnv_xive.c         | 1612 ++++++++++++++++++++++++++++++++++++
>>  hw/intc/xive.c             |   63 +-
>>  hw/ppc/pnv.c               |   58 +-
>>  hw/intc/Makefile.objs      |    2 +-
>>  9 files changed, 2164 insertions(+), 11 deletions(-)
>>  create mode 100644 hw/intc/pnv_xive_regs.h
>>  create mode 100644 include/hw/ppc/pnv_xive.h
>>  create mode 100644 hw/intc/pnv_xive.c
>>
>> diff --git a/hw/intc/pnv_xive_regs.h b/hw/intc/pnv_xive_regs.h
>> new file mode 100644
>> index 000000000000..509d5a18cdde
>> --- /dev/null
>> +++ b/hw/intc/pnv_xive_regs.h
>> @@ -0,0 +1,314 @@
>> +/*
>> + * QEMU PowerPC XIVE interrupt controller model
>> + *
>> + * Copyright (c) 2017-2018, IBM Corporation.
>> + *
>> + * This code is licensed under the GPL version 2 or later. See the
>> + * COPYING file in the top-level directory.
>> + */ >> + >> +#ifndef PPC_PNV_XIVE_REGS_H >> +#define PPC_PNV_XIVE_REGS_H >> + >> +/* IC register offsets 0x0 - 0x400 */ >> +#define CQ_SWI_CMD_HIST 0x020 >> +#define CQ_SWI_CMD_POLL 0x028 >> +#define CQ_SWI_CMD_BCAST 0x030 >> +#define CQ_SWI_CMD_ASSIGN 0x038 >> +#define CQ_SWI_CMD_BLK_UPD 0x040 >> +#define CQ_SWI_RSP 0x048 >> +#define X_CQ_CFG_PB_GEN 0x0a >> +#define CQ_CFG_PB_GEN 0x050 >> +#define CQ_INT_ADDR_OPT PPC_BITMASK(14, 15) >> +#define X_CQ_IC_BAR 0x10 >> +#define X_CQ_MSGSND 0x0b >> +#define CQ_MSGSND 0x058 >> +#define CQ_CNPM_SEL 0x078 >> +#define CQ_IC_BAR 0x080 >> +#define CQ_IC_BAR_VALID PPC_BIT(0) >> +#define CQ_IC_BAR_64K PPC_BIT(1) >> +#define X_CQ_TM1_BAR 0x12 >> +#define CQ_TM1_BAR 0x90 >> +#define X_CQ_TM2_BAR 0x014 >> +#define CQ_TM2_BAR 0x0a0 >> +#define CQ_TM_BAR_VALID PPC_BIT(0) >> +#define CQ_TM_BAR_64K PPC_BIT(1) >> +#define X_CQ_PC_BAR 0x16 >> +#define CQ_PC_BAR 0x0b0 >> +#define CQ_PC_BAR_VALID PPC_BIT(0) >> +#define X_CQ_PC_BARM 0x17 >> +#define CQ_PC_BARM 0x0b8 >> +#define CQ_PC_BARM_MASK PPC_BITMASK(26, 38) >> +#define X_CQ_VC_BAR 0x18 >> +#define CQ_VC_BAR 0x0c0 >> +#define CQ_VC_BAR_VALID PPC_BIT(0) >> +#define X_CQ_VC_BARM 0x19 >> +#define CQ_VC_BARM 0x0c8 >> +#define CQ_VC_BARM_MASK PPC_BITMASK(21, 37) >> +#define X_CQ_TAR 0x1e >> +#define CQ_TAR 0x0f0 >> +#define CQ_TAR_TBL_AUTOINC PPC_BIT(0) >> +#define CQ_TAR_TSEL PPC_BITMASK(12, 15) >> +#define CQ_TAR_TSEL_BLK PPC_BIT(12) >> +#define CQ_TAR_TSEL_MIG PPC_BIT(13) >> +#define CQ_TAR_TSEL_VDT PPC_BIT(14) >> +#define CQ_TAR_TSEL_EDT PPC_BIT(15) >> +#define CQ_TAR_TSEL_INDEX PPC_BITMASK(26, 31) >> +#define X_CQ_TDR 0x1f >> +#define CQ_TDR 0x0f8 >> +#define CQ_TDR_VDT_VALID PPC_BIT(0) >> +#define CQ_TDR_VDT_BLK PPC_BITMASK(11, 15) >> +#define CQ_TDR_VDT_INDEX PPC_BITMASK(28, 31) >> +#define CQ_TDR_EDT_TYPE PPC_BITMASK(0, 1) >> +#define CQ_TDR_EDT_INVALID 0 >> +#define CQ_TDR_EDT_IPI 1 >> +#define CQ_TDR_EDT_EQ 2 >> +#define CQ_TDR_EDT_BLK PPC_BITMASK(12, 15) >> +#define 
CQ_TDR_EDT_INDEX PPC_BITMASK(26, 31) >> +#define X_CQ_PBI_CTL 0x20 >> +#define CQ_PBI_CTL 0x100 >> +#define CQ_PBI_PC_64K PPC_BIT(5) >> +#define CQ_PBI_VC_64K PPC_BIT(6) >> +#define CQ_PBI_LNX_TRIG PPC_BIT(7) >> +#define CQ_PBI_FORCE_TM_LOCAL PPC_BIT(22) >> +#define CQ_PBO_CTL 0x108 >> +#define CQ_AIB_CTL 0x110 >> +#define X_CQ_RST_CTL 0x23 >> +#define CQ_RST_CTL 0x118 >> +#define X_CQ_FIRMASK 0x33 >> +#define CQ_FIRMASK 0x198 >> +#define X_CQ_FIRMASK_AND 0x34 >> +#define CQ_FIRMASK_AND 0x1a0 >> +#define X_CQ_FIRMASK_OR 0x35 >> +#define CQ_FIRMASK_OR 0x1a8 >> + >> +/* PC LBS1 register offsets 0x400 - 0x800 */ >> +#define X_PC_TCTXT_CFG 0x100 >> +#define PC_TCTXT_CFG 0x400 >> +#define PC_TCTXT_CFG_BLKGRP_EN PPC_BIT(0) >> +#define PC_TCTXT_CFG_TARGET_EN PPC_BIT(1) >> +#define PC_TCTXT_CFG_LGS_EN PPC_BIT(2) >> +#define PC_TCTXT_CFG_STORE_ACK PPC_BIT(3) >> +#define PC_TCTXT_CFG_HARD_CHIPID_BLK PPC_BIT(8) >> +#define PC_TCTXT_CHIPID_OVERRIDE PPC_BIT(9) >> +#define PC_TCTXT_CHIPID PPC_BITMASK(12, 15) >> +#define PC_TCTXT_INIT_AGE PPC_BITMASK(30, 31) >> +#define X_PC_TCTXT_TRACK 0x101 >> +#define PC_TCTXT_TRACK 0x408 >> +#define PC_TCTXT_TRACK_EN PPC_BIT(0) >> +#define X_PC_TCTXT_INDIR0 0x104 >> +#define PC_TCTXT_INDIR0 0x420 >> +#define PC_TCTXT_INDIR_VALID PPC_BIT(0) >> +#define PC_TCTXT_INDIR_THRDID PPC_BITMASK(9, 15) >> +#define X_PC_TCTXT_INDIR1 0x105 >> +#define PC_TCTXT_INDIR1 0x428 >> +#define X_PC_TCTXT_INDIR2 0x106 >> +#define PC_TCTXT_INDIR2 0x430 >> +#define X_PC_TCTXT_INDIR3 0x107 >> +#define PC_TCTXT_INDIR3 0x438 >> +#define X_PC_THREAD_EN_REG0 0x108 >> +#define PC_THREAD_EN_REG0 0x440 >> +#define X_PC_THREAD_EN_REG0_SET 0x109 >> +#define PC_THREAD_EN_REG0_SET 0x448 >> +#define X_PC_THREAD_EN_REG0_CLR 0x10a >> +#define PC_THREAD_EN_REG0_CLR 0x450 >> +#define X_PC_THREAD_EN_REG1 0x10c >> +#define PC_THREAD_EN_REG1 0x460 >> +#define X_PC_THREAD_EN_REG1_SET 0x10d >> +#define PC_THREAD_EN_REG1_SET 0x468 >> +#define X_PC_THREAD_EN_REG1_CLR 0x10e >> +#define 
PC_THREAD_EN_REG1_CLR 0x470 >> +#define X_PC_GLOBAL_CONFIG 0x110 >> +#define PC_GLOBAL_CONFIG 0x480 >> +#define PC_GCONF_INDIRECT PPC_BIT(32) >> +#define PC_GCONF_CHIPID_OVR PPC_BIT(40) >> +#define PC_GCONF_CHIPID PPC_BITMASK(44, 47) >> +#define X_PC_VSD_TABLE_ADDR 0x111 >> +#define PC_VSD_TABLE_ADDR 0x488 >> +#define X_PC_VSD_TABLE_DATA 0x112 >> +#define PC_VSD_TABLE_DATA 0x490 >> +#define X_PC_AT_KILL 0x116 >> +#define PC_AT_KILL 0x4b0 >> +#define PC_AT_KILL_VALID PPC_BIT(0) >> +#define PC_AT_KILL_BLOCK_ID PPC_BITMASK(27, 31) >> +#define PC_AT_KILL_OFFSET PPC_BITMASK(48, 60) >> +#define X_PC_AT_KILL_MASK 0x117 >> +#define PC_AT_KILL_MASK 0x4b8 >> + >> +/* PC LBS2 register offsets */ >> +#define X_PC_VPC_CACHE_ENABLE 0x161 >> +#define PC_VPC_CACHE_ENABLE 0x708 >> +#define PC_VPC_CACHE_EN_MASK PPC_BITMASK(0, 31) >> +#define X_PC_VPC_SCRUB_TRIG 0x162 >> +#define PC_VPC_SCRUB_TRIG 0x710 >> +#define X_PC_VPC_SCRUB_MASK 0x163 >> +#define PC_VPC_SCRUB_MASK 0x718 >> +#define PC_SCRUB_VALID PPC_BIT(0) >> +#define PC_SCRUB_WANT_DISABLE PPC_BIT(1) >> +#define PC_SCRUB_WANT_INVAL PPC_BIT(2) >> +#define PC_SCRUB_BLOCK_ID PPC_BITMASK(27, 31) >> +#define PC_SCRUB_OFFSET PPC_BITMASK(45, 63) >> +#define X_PC_VPC_CWATCH_SPEC 0x167 >> +#define PC_VPC_CWATCH_SPEC 0x738 >> +#define PC_VPC_CWATCH_CONFLICT PPC_BIT(0) >> +#define PC_VPC_CWATCH_FULL PPC_BIT(8) >> +#define PC_VPC_CWATCH_BLOCKID PPC_BITMASK(27, 31) >> +#define PC_VPC_CWATCH_OFFSET PPC_BITMASK(45, 63) >> +#define X_PC_VPC_CWATCH_DAT0 0x168 >> +#define PC_VPC_CWATCH_DAT0 0x740 >> +#define X_PC_VPC_CWATCH_DAT1 0x169 >> +#define PC_VPC_CWATCH_DAT1 0x748 >> +#define X_PC_VPC_CWATCH_DAT2 0x16a >> +#define PC_VPC_CWATCH_DAT2 0x750 >> +#define X_PC_VPC_CWATCH_DAT3 0x16b >> +#define PC_VPC_CWATCH_DAT3 0x758 >> +#define X_PC_VPC_CWATCH_DAT4 0x16c >> +#define PC_VPC_CWATCH_DAT4 0x760 >> +#define X_PC_VPC_CWATCH_DAT5 0x16d >> +#define PC_VPC_CWATCH_DAT5 0x768 >> +#define X_PC_VPC_CWATCH_DAT6 0x16e >> +#define PC_VPC_CWATCH_DAT6 0x770 
>> +#define X_PC_VPC_CWATCH_DAT7 0x16f >> +#define PC_VPC_CWATCH_DAT7 0x778 >> + >> +/* VC0 register offsets 0x800 - 0xFFF */ >> +#define X_VC_GLOBAL_CONFIG 0x200 >> +#define VC_GLOBAL_CONFIG 0x800 >> +#define VC_GCONF_INDIRECT PPC_BIT(32) >> +#define X_VC_VSD_TABLE_ADDR 0x201 >> +#define VC_VSD_TABLE_ADDR 0x808 >> +#define X_VC_VSD_TABLE_DATA 0x202 >> +#define VC_VSD_TABLE_DATA 0x810 >> +#define VC_IVE_ISB_BLOCK_MODE 0x818 >> +#define VC_EQD_BLOCK_MODE 0x820 >> +#define VC_VPS_BLOCK_MODE 0x828 >> +#define X_VC_IRQ_CONFIG_IPI 0x208 >> +#define VC_IRQ_CONFIG_IPI 0x840 >> +#define VC_IRQ_CONFIG_MEMB_EN PPC_BIT(45) >> +#define VC_IRQ_CONFIG_MEMB_SZ PPC_BITMASK(46, 51) >> +#define VC_IRQ_CONFIG_HW 0x848 >> +#define VC_IRQ_CONFIG_CASCADE1 0x850 >> +#define VC_IRQ_CONFIG_CASCADE2 0x858 >> +#define VC_IRQ_CONFIG_REDIST 0x860 >> +#define VC_IRQ_CONFIG_IPI_CASC 0x868 >> +#define X_VC_AIB_TX_ORDER_TAG2 0x22d >> +#define VC_AIB_TX_ORDER_TAG2_REL_TF PPC_BIT(20) >> +#define VC_AIB_TX_ORDER_TAG2 0x890 >> +#define X_VC_AT_MACRO_KILL 0x23e >> +#define VC_AT_MACRO_KILL 0x8b0 >> +#define X_VC_AT_MACRO_KILL_MASK 0x23f >> +#define VC_AT_MACRO_KILL_MASK 0x8b8 >> +#define VC_KILL_VALID PPC_BIT(0) >> +#define VC_KILL_TYPE PPC_BITMASK(14, 15) >> +#define VC_KILL_IRQ 0 >> +#define VC_KILL_IVC 1 >> +#define VC_KILL_SBC 2 >> +#define VC_KILL_EQD 3 >> +#define VC_KILL_BLOCK_ID PPC_BITMASK(27, 31) >> +#define VC_KILL_OFFSET PPC_BITMASK(48, 60) >> +#define X_VC_EQC_CACHE_ENABLE 0x211 >> +#define VC_EQC_CACHE_ENABLE 0x908 >> +#define VC_EQC_CACHE_EN_MASK PPC_BITMASK(0, 15) >> +#define X_VC_EQC_SCRUB_TRIG 0x212 >> +#define VC_EQC_SCRUB_TRIG 0x910 >> +#define X_VC_EQC_SCRUB_MASK 0x213 >> +#define VC_EQC_SCRUB_MASK 0x918 >> +#define X_VC_EQC_CWATCH_SPEC 0x215 >> +#define VC_EQC_CONFIG 0x920 >> +#define X_VC_EQC_CONFIG 0x214 >> +#define VC_EQC_CONF_SYNC_IPI PPC_BIT(32) >> +#define VC_EQC_CONF_SYNC_HW PPC_BIT(33) >> +#define VC_EQC_CONF_SYNC_ESC1 PPC_BIT(34) >> +#define VC_EQC_CONF_SYNC_ESC2 
PPC_BIT(35) >> +#define VC_EQC_CONF_SYNC_REDI PPC_BIT(36) >> +#define VC_EQC_CONF_EQP_INTERLEAVE PPC_BIT(38) >> +#define VC_EQC_CONF_ENABLE_END_s_BIT PPC_BIT(39) >> +#define VC_EQC_CONF_ENABLE_END_u_BIT PPC_BIT(40) >> +#define VC_EQC_CONF_ENABLE_END_c_BIT PPC_BIT(41) >> +#define VC_EQC_CONF_ENABLE_MORE_QSZ PPC_BIT(42) >> +#define VC_EQC_CONF_SKIP_ESCALATE PPC_BIT(43) >> +#define VC_EQC_CWATCH_SPEC 0x928 >> +#define VC_EQC_CWATCH_CONFLICT PPC_BIT(0) >> +#define VC_EQC_CWATCH_FULL PPC_BIT(8) >> +#define VC_EQC_CWATCH_BLOCKID PPC_BITMASK(28, 31) >> +#define VC_EQC_CWATCH_OFFSET PPC_BITMASK(40, 63) >> +#define X_VC_EQC_CWATCH_DAT0 0x216 >> +#define VC_EQC_CWATCH_DAT0 0x930 >> +#define X_VC_EQC_CWATCH_DAT1 0x217 >> +#define VC_EQC_CWATCH_DAT1 0x938 >> +#define X_VC_EQC_CWATCH_DAT2 0x218 >> +#define VC_EQC_CWATCH_DAT2 0x940 >> +#define X_VC_EQC_CWATCH_DAT3 0x219 >> +#define VC_EQC_CWATCH_DAT3 0x948 >> +#define X_VC_IVC_SCRUB_TRIG 0x222 >> +#define VC_IVC_SCRUB_TRIG 0x990 >> +#define X_VC_IVC_SCRUB_MASK 0x223 >> +#define VC_IVC_SCRUB_MASK 0x998 >> +#define X_VC_SBC_SCRUB_TRIG 0x232 >> +#define VC_SBC_SCRUB_TRIG 0xa10 >> +#define X_VC_SBC_SCRUB_MASK 0x233 >> +#define VC_SBC_SCRUB_MASK 0xa18 >> +#define VC_SCRUB_VALID PPC_BIT(0) >> +#define VC_SCRUB_WANT_DISABLE PPC_BIT(1) >> +#define VC_SCRUB_WANT_INVAL PPC_BIT(2) /* EQC and SBC only */ >> +#define VC_SCRUB_BLOCK_ID PPC_BITMASK(28, 31) >> +#define VC_SCRUB_OFFSET PPC_BITMASK(40, 63) >> +#define X_VC_IVC_CACHE_ENABLE 0x221 >> +#define VC_IVC_CACHE_ENABLE 0x988 >> +#define VC_IVC_CACHE_EN_MASK PPC_BITMASK(0, 15) >> +#define X_VC_SBC_CACHE_ENABLE 0x231 >> +#define VC_SBC_CACHE_ENABLE 0xa08 >> +#define VC_SBC_CACHE_EN_MASK PPC_BITMASK(0, 15) >> +#define VC_IVC_CACHE_SCRUB_TRIG 0x990 >> +#define VC_IVC_CACHE_SCRUB_MASK 0x998 >> +#define VC_SBC_CACHE_ENABLE 0xa08 >> +#define VC_SBC_CACHE_SCRUB_TRIG 0xa10 >> +#define VC_SBC_CACHE_SCRUB_MASK 0xa18 >> +#define VC_SBC_CONFIG 0xa20 >> +#define X_VC_SBC_CONFIG 0x234 >> +#define 
VC_SBC_CONF_CPLX_CIST PPC_BIT(44)
>> +#define VC_SBC_CONF_CIST_BOTH PPC_BIT(45)
>> +#define VC_SBC_CONF_NO_UPD_PRF PPC_BIT(59)
>> +
>> +/* VC1 register offsets */
>> +
>> +/* VSD Table address register definitions (shared) */
>> +#define VST_ADDR_AUTOINC PPC_BIT(0)
>> +#define VST_TABLE_SELECT PPC_BITMASK(13, 15)
>> +#define VST_TSEL_IVT 0
>> +#define VST_TSEL_SBE 1
>> +#define VST_TSEL_EQDT 2
>> +#define VST_TSEL_VPDT 3
>> +#define VST_TSEL_IRQ 4 /* VC only */
>> +#define VST_TABLE_BLOCK PPC_BITMASK(27, 31)
>> +
>> +/* Number of queue overflow pages */
>> +#define VC_QUEUE_OVF_COUNT 6
>> +
>> +/* Bits in a VSD entry.
>> + *
>> + * Note: the address is naturally aligned, we don't use a PPC_BITMASK,
>> + * but just a mask to apply to the address before OR'ing it in.
>> + *
>> + * Note: VSD_FIRMWARE is a SW bit ! It hijacks an unused bit in the
>> + * VSD and is only meant to be used in indirect mode !
>> + */
>> +#define VSD_MODE PPC_BITMASK(0, 1)
>> +#define VSD_MODE_SHARED 1
>> +#define VSD_MODE_EXCLUSIVE 2
>> +#define VSD_MODE_FORWARD 3
>> +#define VSD_ADDRESS_MASK 0x0ffffffffffff000ull
>> +#define VSD_MIGRATION_REG PPC_BITMASK(52, 55)
>> +#define VSD_INDIRECT PPC_BIT(56)
>> +#define VSD_TSIZE PPC_BITMASK(59, 63)
>> +#define VSD_FIRMWARE PPC_BIT(2) /* Read warning above */
>> +
>> +#define VC_EQC_SYNC_MASK         \
>> +    (VC_EQC_CONF_SYNC_IPI  |     \
>> +     VC_EQC_CONF_SYNC_HW   |     \
>> +     VC_EQC_CONF_SYNC_ESC1 |     \
>> +     VC_EQC_CONF_SYNC_ESC2 |     \
>> +     VC_EQC_CONF_SYNC_REDI)
>> +
>> +
>> +#endif /* PPC_PNV_XIVE_REGS_H */
>> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
>> index 86d5f54e5459..402dd8f6452c 100644
>> --- a/include/hw/ppc/pnv.h
>> +++ b/include/hw/ppc/pnv.h
>> @@ -25,6 +25,7 @@
>>  #include "hw/ppc/pnv_lpc.h"
>>  #include "hw/ppc/pnv_psi.h"
>>  #include "hw/ppc/pnv_occ.h"
>> +#include "hw/ppc/pnv_xive.h"
>>  
>>  #define TYPE_PNV_CHIP "pnv-chip"
>>  #define PNV_CHIP(obj) OBJECT_CHECK(PnvChip, (obj), TYPE_PNV_CHIP)
>> @@ -82,6 +83,7 @@ typedef struct Pnv9Chip {
>>  PnvChip parent_obj;
>>  
>>  /*< public >*/
>> + PnvXive xive;
>>  } Pnv9Chip;
>>  
>>  typedef struct PnvChipClass {
>> @@ -205,7 +207,6 @@ void pnv_bmc_powerdown(IPMIBmc *bmc);
>>  #define PNV_ICP_BASE(chip) \
>>      (0x0003ffff80000000ull + (uint64_t) PNV_CHIP_INDEX(chip) * PNV_ICP_SIZE)
>>  
>> -
>>  #define PNV_PSIHB_SIZE 0x0000000000100000ull
>>  #define PNV_PSIHB_BASE(chip) \
>>      (0x0003fffe80000000ull + (uint64_t)PNV_CHIP_INDEX(chip) * PNV_PSIHB_SIZE)
>> @@ -215,4 +216,23 @@ void pnv_bmc_powerdown(IPMIBmc *bmc);
>>      (0x0003ffe000000000ull + (uint64_t)PNV_CHIP_INDEX(chip) * \
>>       PNV_PSIHB_FSP_SIZE)
>>  
>> +/*
>> + * POWER9 MMIO base addresses
>> + */
>> +#define PNV9_CHIP_BASE(chip, base) \
>> +    ((base) + ((uint64_t) (chip)->chip_id << 42))
>> +
>> +#define PNV9_XIVE_VC_SIZE 0x0000008000000000ull
>> +#define PNV9_XIVE_VC_BASE(chip) PNV9_CHIP_BASE(chip, 0x0006010000000000ull)
>> +
>> +#define PNV9_XIVE_PC_SIZE 0x0000001000000000ull
>> +#define PNV9_XIVE_PC_BASE(chip) PNV9_CHIP_BASE(chip, 0x0006018000000000ull)
>> +
>> +#define PNV9_XIVE_IC_SIZE 0x0000000000080000ull
>> +#define PNV9_XIVE_IC_BASE(chip) PNV9_CHIP_BASE(chip, 0x0006030203100000ull)
>> +
>> +#define PNV9_XIVE_TM_SIZE 0x0000000000040000ull
>> +#define PNV9_XIVE_TM_BASE(chip) PNV9_CHIP_BASE(chip, 0x0006030203180000ull)
>> +
>> +
>>  #endif /* _PPC_PNV_H */
>> diff --git a/include/hw/ppc/pnv_xive.h b/include/hw/ppc/pnv_xive.h
>> new file mode 100644
>> index 000000000000..5b64d4cafe8f
>> --- /dev/null
>> +++ b/include/hw/ppc/pnv_xive.h
>> @@ -0,0 +1,100 @@
>> +/*
>> + * QEMU PowerPC XIVE interrupt controller model
>> + *
>> + * Copyright (c) 2017-2018, IBM Corporation.
>> + *
>> + * This code is licensed under the GPL version 2 or later. See the
>> + * COPYING file in the top-level directory.
>> + */
>> +
>> +#ifndef PPC_PNV_XIVE_H
>> +#define PPC_PNV_XIVE_H
>> +
>> +#include "hw/sysbus.h"
>> +#include "hw/ppc/xive.h"
>> +
>> +#define TYPE_PNV_XIVE "pnv-xive"
>> +#define PNV_XIVE(obj) OBJECT_CHECK(PnvXive, (obj), TYPE_PNV_XIVE)
>> +
>> +#define XIVE_BLOCK_MAX 16
>> +
>> +#define XIVE_XLATE_BLK_MAX 16 /* Block Scope Table (0-15) */
>> +#define XIVE_XLATE_MIG_MAX 16 /* Migration Register Table (1-15) */
>> +#define XIVE_XLATE_VDT_MAX 16 /* VDT Domain Table (0-15) */
>> +#define XIVE_XLATE_EDT_MAX 64 /* EDT Domain Table (0-63) */
>> +
>> +typedef struct PnvXive {
>> +    XiveRouter parent_obj;
>> +
>> +    /* Can be overridden by XIVE configuration */
>> +    uint32_t thread_chip_id;
>> +    uint32_t chip_id;
>
> These have similar names but they're very different AFAICT - one is
> static configuration, the other runtime state. I'd generally order
> structures so that configuration information is in one block, computed
> at initialization then static in another, then runtime state in a
> third - it's both clearer and (usually) more cache efficient.

yes. This is a good practice.

> Sometimes that's less important than other logical groupings, but I
> don't think that's the case here.
>
>> +
>> +    /* Interrupt controller regs */
>> +    uint64_t regs[0x300];
>> +    MemoryRegion xscom_regs;
>> +
>> +    /* For IPIs and accelerator interrupts */
>> +    uint32_t nr_irqs;
>> +    XiveSource source;
>> +
>> +    uint32_t nr_ends;
>> +    XiveENDSource end_source;
>> +
>> +    /* Cache update registers */
>> +    uint64_t eqc_watch[4];
>> +    uint64_t vpc_watch[8];
>> +
>> +    /* Virtual Structure Table Descriptors : EAT, SBE, ENDT, NVTT, IRQ */
>> +    uint64_t vsds[5][XIVE_BLOCK_MAX];
>> +
>> +    /* Set Translation tables */
>> +    bool set_xlate_autoinc;
>> +    uint64_t set_xlate_index;
>> +    uint64_t set_xlate;
>> +
>> +    uint64_t set_xlate_blk[XIVE_XLATE_BLK_MAX];
>> +    uint64_t set_xlate_mig[XIVE_XLATE_MIG_MAX];
>> +    uint64_t set_xlate_vdt[XIVE_XLATE_VDT_MAX];
>> +    uint64_t set_xlate_edt[XIVE_XLATE_EDT_MAX];
>> +
>> +    /* Interrupt controller MMIO */
>> +    hwaddr ic_base;
>> +    uint32_t ic_shift;
>> +    MemoryRegion ic_mmio;
>> +    MemoryRegion ic_reg_mmio;
>> +    MemoryRegion ic_notify_mmio;
>> +
>> +    /* VC memory regions */
>> +    hwaddr vc_base;
>> +    uint64_t vc_size;
>> +    uint32_t vc_shift;
>> +    MemoryRegion vc_mmio;
>> +
>> +    /* IPI and END address space to model the EDT segmentation */
>> +    uint32_t edt_shift;
>> +    MemoryRegion ipi_mmio;
>> +    AddressSpace ipi_as;
>> +    MemoryRegion end_mmio;
>> +    AddressSpace end_as;
>> +
>> +    /* PC memory regions */
>> +    hwaddr pc_base;
>> +    uint64_t pc_size;
>> +    uint32_t pc_shift;
>> +    MemoryRegion pc_mmio;
>> +    uint32_t vdt_shift;
>> +
>> +    /* TIMA memory regions */
>> +    hwaddr tm_base;
>> +    uint32_t tm_shift;
>> +    MemoryRegion tm_mmio;
>> +    MemoryRegion tm_mmio_indirect;
>> +
>> +    /* CPU for indirect TIMA access */
>> +    PowerPCCPU *cpu_ind;
>> +} PnvXive;
>> +
>> +void pnv_xive_pic_print_info(PnvXive *xive, Monitor *mon);
>> +
>> +#endif /* PPC_PNV_XIVE_H */
>> diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
>> index 255b26a5aaf6..6623ec54a7a8 100644
>> --- a/include/hw/ppc/pnv_xscom.h
>> +++ b/include/hw/ppc/pnv_xscom.h
>> @@ 
-73,6 +73,9 @@ typedef struct PnvXScomInterfaceClass {
>>  #define PNV_XSCOM_OCC_BASE 0x0066000
>>  #define PNV_XSCOM_OCC_SIZE 0x6000
>>  
>> +#define PNV9_XSCOM_XIVE_BASE 0x5013000
>> +#define PNV9_XSCOM_XIVE_SIZE 0x300
>> +
>>  extern void pnv_xscom_realize(PnvChip *chip, Error **errp);
>>  extern int pnv_dt_xscom(PnvChip *chip, void *fdt, int offset);
>>  
>> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
>> index c8201462d698..6089511cff83 100644
>> --- a/include/hw/ppc/xive.h
>> +++ b/include/hw/ppc/xive.h
>> @@ -237,6 +237,7 @@ int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
>>                          XiveNVT *nvt);
>>  int xive_router_set_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
>>                          XiveNVT *nvt);
>> +void xive_router_notify(XiveFabric *xf, uint32_t lisn);
>>  
>>  /*
>>   * XIVE END ESBs
>> diff --git a/hw/intc/pnv_xive.c b/hw/intc/pnv_xive.c
>> new file mode 100644
>> index 000000000000..9f0c41cdb750
>> --- /dev/null
>> +++ b/hw/intc/pnv_xive.c
>> @@ -0,0 +1,1612 @@
>> +/*
>> + * QEMU PowerPC XIVE interrupt controller model
>> + *
>> + * Copyright (c) 2017-2018, IBM Corporation.
>> + *
>> + * This code is licensed under the GPL version 2 or later. See the
>> + * COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/log.h"
>> +#include "qapi/error.h"
>> +#include "target/ppc/cpu.h"
>> +#include "sysemu/cpus.h"
>> +#include "sysemu/dma.h"
>> +#include "monitor/monitor.h"
>> +#include "hw/ppc/fdt.h"
>> +#include "hw/ppc/pnv.h"
>> +#include "hw/ppc/pnv_xscom.h"
>> +#include "hw/ppc/pnv_xive.h"
>> +#include "hw/ppc/xive_regs.h"
>> +#include "hw/ppc/ppc.h"
>> +
>> +#include
>> +
>> +#include "pnv_xive_regs.h"
>> +
>> +/*
>> + * Interrupt source number encoding
>> + */
>> +#define SRCNO_BLOCK(srcno) (((srcno) >> 28) & 0xf)
>> +#define SRCNO_INDEX(srcno) ((srcno) & 0x0fffffff)
>> +#define XIVE_SRCNO(blk, idx) ((uint32_t)(blk) << 28 | (idx))
>> +
>> +/*
>> + * Virtual structures table accessors
>> + */
>> +typedef struct XiveVstInfo {
>> +    const char *name;
>> +    uint32_t size;
>> +    uint32_t max_blocks;
>> +} XiveVstInfo;
>> +
>> +static const XiveVstInfo vst_infos[] = {
>> +    [VST_TSEL_IVT]  = { "EAT",  sizeof(XiveEAS), 16 },
>> +    [VST_TSEL_SBE]  = { "SBE",  0,               16 },
>> +    [VST_TSEL_EQDT] = { "ENDT", sizeof(XiveEND), 16 },
>> +    [VST_TSEL_VPDT] = { "VPDT", sizeof(XiveNVT), 32 },

> Are those VST_TSEL_* things named in the XIVE documentation? If not,
> you probably want to rename them to reflect the new-style naming.

The register documentation still refers to IVE, ESB, EQD, VPD ...

>
>> +    /* Interrupt fifo backing store table :
>> +     *
>> +     * 0 - IPI,
>> +     * 1 - HWD,
>> +     * 2 - First escalate,
>> +     * 3 - Second escalate,
>> +     * 4 - Redistribution,
>> +     * 5 - IPI cascaded queue ?
>> +     */
>> +    [VST_TSEL_IRQ]  = { "IRQ",  0,               6 },
>> +};
>> +
>> +#define xive_error(xive, fmt, ...)                                      \
>> +    qemu_log_mask(LOG_GUEST_ERROR, "XIVE[%x] - " fmt "\n", (xive)->chip_id, \
>> +                  ## __VA_ARGS__);
>> +
>> +/*
>> + * Our lookup routine for a remote XIVE IC. A simple scan of the chips.
>> + */
>> +static PnvXive *pnv_xive_get_ic(PnvXive *xive, uint8_t blk)
>> +{
>> +    PnvMachineState *pnv = PNV_MACHINE(qdev_get_machine());
>> +    int i;
>> +
>> +    for (i = 0; i < pnv->num_chips; i++) {
>> +        Pnv9Chip *chip9 = PNV9_CHIP(pnv->chips[i]);
>> +        PnvXive *ic_xive = &chip9->xive;
>> +        bool chip_override =
>> +            ic_xive->regs[PC_GLOBAL_CONFIG >> 3] & PC_GCONF_CHIPID_OVR;
>> +
>> +        if (chip_override) {
>> +            if (ic_xive->chip_id == blk) {
>> +                return ic_xive;
>> +            }
>> +        } else {
>> +            ; /* TODO: Block scope support */
>> +        }
>> +    }
>> +    xive_error(xive, "VST: unknown chip/block %d !?", blk);
>> +    return NULL;
>> +}
>> +
>> +/*
>> + * Virtual Structures Table accessors for SBE, EAT, ENDT, NVT
>> + */
>> +static uint64_t pnv_xive_vst_addr_direct(PnvXive *xive,
>> +                                         const XiveVstInfo *info, uint64_t vsd,
>> +                                         uint8_t blk, uint32_t idx)
>> +{
>> +    uint64_t vst_addr = vsd & VSD_ADDRESS_MASK;
>> +    uint64_t vst_tsize = 1ull << (GETFIELD(VSD_TSIZE, vsd) + 12);
>> +    uint32_t idx_max = (vst_tsize / info->size) - 1;
>> +
>> +    if (idx > idx_max) {
>> +#ifdef XIVE_DEBUG
>> +        xive_error(xive, "VST: %s entry %x/%x out of range !?", info->name,
>> +                   blk, idx);
>> +#endif
>> +        return 0;
>> +    }
>> +
>> +    return vst_addr + idx * info->size;
>> +}
>> +
>> +#define XIVE_VSD_SIZE 8
>> +
>> +static uint64_t pnv_xive_vst_addr_indirect(PnvXive *xive,
>> +                                           const XiveVstInfo *info,
>> +                                           uint64_t vsd, uint8_t blk,
>> +                                           uint32_t idx)
>> +{
>> +    uint64_t vsd_addr;
>> +    uint64_t vst_addr;
>> +    uint32_t page_shift;
>> +    uint32_t page_mask;
>> +    uint64_t vst_tsize = 1ull << (GETFIELD(VSD_TSIZE, vsd) + 12);
>> +    uint32_t idx_max = (vst_tsize / XIVE_VSD_SIZE) - 1;
>> +
>> +    if (idx > idx_max) {
>> +#ifdef XIVE_DEBUG
>> +        xive_error(xive, "VET: %s entry %x/%x out of range !?", info->name,
>> +                   blk, idx);
>> +#endif
>> +        return 0;
>> +    }
>> +
>> +    vsd_addr = vsd & VSD_ADDRESS_MASK;
>> +
>> +    /*
>> +     * Read the first descriptor to get the page size of each indirect
>> +     * table.
>> +     */
>> +    vsd = ldq_be_dma(&address_space_memory, vsd_addr);
>> +    page_shift = GETFIELD(VSD_TSIZE, vsd) + 12;
>> +    page_mask = (1ull << page_shift) - 1;
>> +
>> +    /* Indirect page size can be 4K, 64K, 2M. */
>> +    if (page_shift != 12 && page_shift != 16 && page_shift != 23) {

> page_shift == 23?? That's 8 MiB.

ooups. and I should add 16M also.

>
>> +        xive_error(xive, "VST: invalid %s table shift %d", info->name,
>> +                   page_shift);
>> +    }
>> +
>> +    if (!(vsd & VSD_ADDRESS_MASK)) {
>> +        xive_error(xive, "VST: invalid %s entry %x/%x !?", info->name,
>> +                   blk, 0);
>> +        return 0;
>> +    }
>> +
>> +    /* Load the descriptor we are looking for, if not already done */
>> +    if (idx) {
>> +        vsd_addr = vsd_addr + (idx >> page_shift);
>> +        vsd = ldq_be_dma(&address_space_memory, vsd_addr);
>> +
>> +        if (page_shift != GETFIELD(VSD_TSIZE, vsd) + 12) {
>> +            xive_error(xive, "VST: %s entry %x/%x indirect page size differ !?",
>> +                       info->name, blk, idx);
>> +            return 0;
>> +        }
>> +    }
>> +
>> +    vst_addr = vsd & VSD_ADDRESS_MASK;
>> +
>> +    return vst_addr + (idx & page_mask) * info->size;
>> +}
>> +
>> +static uint64_t pnv_xive_vst_addr(PnvXive *xive, uint8_t type, uint8_t blk,
>> +                                  uint32_t idx)
>> +{
>> +    uint64_t vsd;
>> +
>> +    if (blk >= vst_infos[type].max_blocks) {
>> +        xive_error(xive, "VST: invalid block id %d for VST %s %d !?",
>> +                   blk, vst_infos[type].name, idx);
>> +        return 0;
>> +    }
>> +
>> +    vsd = xive->vsds[type][blk];
>> +
>> +    /* Remote VST accesses */
>> +    if (GETFIELD(VSD_MODE, vsd) == VSD_MODE_FORWARD) {
>> +        xive = pnv_xive_get_ic(xive, blk);
>> +
>> +        return xive ? 
pnv_xive_vst_addr(xive, type, blk, idx) : 0;
>> +    }
>> +
>> +    if (VSD_INDIRECT & vsd) {
>> +        return pnv_xive_vst_addr_indirect(xive, &vst_infos[type], vsd,
>> +                                          blk, idx);
>> +    }
>> +
>> +    return pnv_xive_vst_addr_direct(xive, &vst_infos[type], vsd, blk, idx);
>> +}
>> +
>> +static int pnv_xive_get_end(XiveRouter *xrtr, uint8_t blk, uint32_t idx,
>> +                            XiveEND *end)
>> +{
>> +    PnvXive *xive = PNV_XIVE(xrtr);
>> +    uint64_t end_addr = pnv_xive_vst_addr(xive, VST_TSEL_EQDT, blk, idx);
>> +
>> +    if (!end_addr) {
>> +        return -1;
>> +    }
>> +
>> +    cpu_physical_memory_read(end_addr, end, sizeof(XiveEND));
>> +    end->w0 = be32_to_cpu(end->w0);
>> +    end->w1 = be32_to_cpu(end->w1);
>> +    end->w2 = be32_to_cpu(end->w2);
>> +    end->w3 = be32_to_cpu(end->w3);
>> +    end->w4 = be32_to_cpu(end->w4);
>> +    end->w5 = be32_to_cpu(end->w5);
>> +    end->w6 = be32_to_cpu(end->w6);
>> +    end->w7 = be32_to_cpu(end->w7);
>> +
>> +    return 0;
>> +}
>> +
>> +static int pnv_xive_set_end(XiveRouter *xrtr, uint8_t blk, uint32_t idx,
>> +                            XiveEND *in_end)
>> +{
>> +    PnvXive *xive = PNV_XIVE(xrtr);
>> +    XiveEND end;
>> +    uint64_t end_addr = pnv_xive_vst_addr(xive, VST_TSEL_EQDT, blk, idx);
>> +
>> +    if (!end_addr) {
>> +        return -1;
>> +    }
>> +
>> +    end.w0 = cpu_to_be32(in_end->w0);
>> +    end.w1 = cpu_to_be32(in_end->w1);
>> +    end.w2 = cpu_to_be32(in_end->w2);
>> +    end.w3 = cpu_to_be32(in_end->w3);
>> +    end.w4 = cpu_to_be32(in_end->w4);
>> +    end.w5 = cpu_to_be32(in_end->w5);
>> +    end.w6 = cpu_to_be32(in_end->w6);
>> +    end.w7 = cpu_to_be32(in_end->w7);
>> +    cpu_physical_memory_write(end_addr, &end, sizeof(XiveEND));
>> +    return 0;
>> +}
>> +
>> +static int pnv_xive_end_update(PnvXive *xive, uint8_t blk, uint32_t idx)
>> +{
>> +    uint64_t end_addr = pnv_xive_vst_addr(xive, VST_TSEL_EQDT, blk, idx);
>> +
>> +    if (!end_addr) {
>> +        return -1;
>> +    }
>> +
>> +    cpu_physical_memory_write(end_addr, xive->eqc_watch, sizeof(XiveEND));
>> +    return 0;
>> +}
>> +
>> 
+static int pnv_xive_get_nvt(XiveRouter *xrtr, uint8_t blk, uint32_t i= dx, >> + XiveNVT *nvt) >> +{ >> + PnvXive *xive =3D PNV_XIVE(xrtr); >> + uint64_t nvt_addr =3D pnv_xive_vst_addr(xive, VST_TSEL_VPDT, blk,= idx); >> + >> + if (!nvt_addr) { >> + return -1; >> + } >> + >> + cpu_physical_memory_read(nvt_addr, nvt, sizeof(XiveNVT)); >> + nvt->w0 =3D cpu_to_be32(nvt->w0); >> + nvt->w1 =3D cpu_to_be32(nvt->w1); >> + nvt->w2 =3D cpu_to_be32(nvt->w2); >> + nvt->w3 =3D cpu_to_be32(nvt->w3); >> + nvt->w4 =3D cpu_to_be32(nvt->w4); >> + nvt->w5 =3D cpu_to_be32(nvt->w5); >> + nvt->w6 =3D cpu_to_be32(nvt->w6); >> + nvt->w7 =3D cpu_to_be32(nvt->w7); >> + >> + return 0; >> +} >> + >> +static int pnv_xive_set_nvt(XiveRouter *xrtr, uint8_t blk, uint32_t i= dx, >> + XiveNVT *in_nvt) >> +{ >> + PnvXive *xive =3D PNV_XIVE(xrtr); >> + XiveNVT nvt; >> + uint64_t nvt_addr =3D pnv_xive_vst_addr(xive, VST_TSEL_VPDT, blk,= idx); >> + >> + if (!nvt_addr) { >> + return -1; >> + } >> + >> + nvt.w0 =3D cpu_to_be32(in_nvt->w0); >> + nvt.w1 =3D cpu_to_be32(in_nvt->w1); >> + nvt.w2 =3D cpu_to_be32(in_nvt->w2); >> + nvt.w3 =3D cpu_to_be32(in_nvt->w3); >> + nvt.w4 =3D cpu_to_be32(in_nvt->w4); >> + nvt.w5 =3D cpu_to_be32(in_nvt->w5); >> + nvt.w6 =3D cpu_to_be32(in_nvt->w6); >> + nvt.w7 =3D cpu_to_be32(in_nvt->w7); >> + cpu_physical_memory_write(nvt_addr, &nvt, sizeof(XiveNVT)); >> + return 0; >> +} >> + >> +static int pnv_xive_nvt_update(PnvXive *xive, uint8_t blk, uint32_t i= dx) >> +{ >> + uint64_t nvt_addr =3D pnv_xive_vst_addr(xive, VST_TSEL_VPDT, blk,= idx); >> + >> + if (!nvt_addr) { >> + return -1; >> + } >> + >> + cpu_physical_memory_write(nvt_addr, xive->vpc_watch, sizeof(XiveN= VT)); >> + return 0; >> +} >> + >> +static int pnv_xive_get_eas(XiveRouter *xrtr, uint32_t srcno, XiveEAS= *eas) >> +{ >> + PnvXive *xive =3D PNV_XIVE(xrtr); >> + uint8_t blk =3D SRCNO_BLOCK(srcno); >> + uint32_t idx =3D SRCNO_INDEX(srcno); >> + uint64_t eas_addr; >> + >> + /* TODO: check when remote EAS lookups 
are possible */
>> + if (pnv_xive_get_ic(xive, blk) != xive) {
>> + xive_error(xive, "VST: EAS %x is remote !?", srcno);
>> + return -1;
>> + }
>> +
>> + eas_addr = pnv_xive_vst_addr(xive, VST_TSEL_IVT, blk, idx);
>> + if (!eas_addr) {
>> + return -1;
>> + }
>> +
>> + eas->w &= ~EAS_VALID;
> 
> Doesn't this get overwritten by the next statement?

Yes, it does.

>> + *((uint64_t *) eas) = ldq_be_dma(&address_space_memory, eas_addr);
> 
> eas->w = ldq... would surely be simpler.

Yes. We are changing the XIVE core layer to use XIVE structures in BE.
It should simplify all the accessors.

> 
>> + return 0;
>> +}
>> +
>> +static int pnv_xive_set_eas(XiveRouter *xrtr, uint32_t srcno, XiveEAS *ive)
>> +{
>> + /* All done. */
> 
> Uh.. what? This is wrong, although I guess it doesn't matter because
> the pnv model never uses set_eas. Another argument for not
> abstracting this path - just write directly in the PAPR code.

Heh :)

> 
>> + return 0;
>> +}
>> +
>> +static int pnv_xive_eas_update(PnvXive *xive, uint32_t idx)
>> +{
>> + /* All done. */
>> + return 0;
>> +}
>> +
>> +/*
>> + * XIVE Set Translation Table configuration
>> + *
>> + * The Virtualization Controller MMIO region containing the IPI ESB
>> + * pages and END ESB pages is sub-divided into "sets" which map
>> + * portions of the VC region to the different ESB pages. It is
>> + * configured at runtime through the EDT set translation table to let
>> + * the firmware decide how to split the address space between IPI ESB
>> + * pages and END ESB pages.
>> + */
>> +static int pnv_xive_set_xlate_update(PnvXive *xive, uint64_t val)
>> +{
>> + uint8_t index = xive->set_xlate_autoinc ?
>> + xive->set_xlate_index++ : xive->set_xlate_index;
> 
> What's the correct hardware behaviour when the index runs off the end
> with autoincrement mode?

It doesn't say... The index is a 6-bit field, so I suppose it should
wrap at 63.
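For what it's worth, the wrapping behaviour assumed here (a 6-bit index that wraps from 63 back to 0 under auto-increment, which the documentation does not confirm) can be sketched as:

```c
#include <stdint.h>

/* Hypothetical helper: if the set translation table index really is a
 * 6-bit field, auto-increment would mask the result to 6 bits, so the
 * index wraps from 63 back to 0 instead of running off the end. */
uint8_t set_xlate_index_next(uint8_t index)
{
    return (index + 1) & 0x3f;
}
```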
=20 >> + uint8_t max_index; >> + uint64_t *xlate_table; >> + >> + switch (xive->set_xlate) { >> + case CQ_TAR_TSEL_BLK: >> + max_index =3D ARRAY_SIZE(xive->set_xlate_blk); >> + xlate_table =3D xive->set_xlate_blk; >> + break; >> + case CQ_TAR_TSEL_MIG: >> + max_index =3D ARRAY_SIZE(xive->set_xlate_mig); >> + xlate_table =3D xive->set_xlate_mig; >> + break; >> + case CQ_TAR_TSEL_EDT: >> + max_index =3D ARRAY_SIZE(xive->set_xlate_edt); >> + xlate_table =3D xive->set_xlate_edt; >> + break; >> + case CQ_TAR_TSEL_VDT: >> + max_index =3D ARRAY_SIZE(xive->set_xlate_vdt); >> + xlate_table =3D xive->set_xlate_vdt; >> + break; >> + default: >> + xive_error(xive, "xlate: invalid table %d", (int) xive->set_x= late); >=20 > In the error case is it correct for the autoincrement to go ahead? ah, no, it isn't. I need to change the logic. >> + return -1; >> + } >> + >> + if (index >=3D max_index) { >> + return -1; >> + } >> + >> + xlate_table[index] =3D val; >> + return 0; >> +} >> + >> +static int pnv_xive_set_xlate_select(PnvXive *xive, uint64_t val) >> +{ >> + xive->set_xlate_autoinc =3D val & CQ_TAR_TBL_AUTOINC; >> + xive->set_xlate =3D val & CQ_TAR_TSEL; >> + xive->set_xlate_index =3D GETFIELD(CQ_TAR_TSEL_INDEX, val); >=20 > Why split this here, rather than just storing the MMIOed value direct > in the regs[] array, then parsing out the bits when you need them? There is no strong reason really. Mostly because the "Set Translation=20 Table Address" register is set before "Set Translation Table" and it was one way to prepare the work to be done. But It doesn't do=20 much so I could use the MMIOed value directly as you propose. > To expand a bit, there are two models you can use for modelling > registers in qemu. You can have a big regs[] with all the registers > make the accessors just read/write that, plus side-effect and special > case handling. 
Or you can have specific fields in your state for the > crucial register values, then have the MMIO access do all the > translation into those underlying registers based on the offset. >=20 > Either model can make sense, depending on how many side effects and > special cases there are. Mixing the two models, which is kind of what > you're doing here, is usually not a good idea. I agree. It also makes the vmstate more complex to figure out, even if we don't care for PowerNV.=20 =20 >> + >> + return 0; >> +} >> + >> +/* >> + * Computes the overall size of the IPI or the END ESB pages >> + */ >> +static uint64_t pnv_xive_set_xlate_edt_size(PnvXive *xive, uint64_t t= ype) >> +{ >> + uint64_t edt_size =3D 1ull << xive->edt_shift; >> + uint64_t size =3D 0; >> + int i; >> + >> + for (i =3D 0; i < XIVE_XLATE_EDT_MAX; i++) { >> + uint64_t edt_type =3D GETFIELD(CQ_TDR_EDT_TYPE, xive->set_xla= te_edt[i]); >> + >> + if (edt_type =3D=3D type) { >> + size +=3D edt_size; >> + } >> + } >> + >> + return size; >> +} >> + >> +/* >> + * Maps an offset of the VC region in the IPI or END region using the >> + * layout defined by the EDT table >> + */ >> +static uint64_t pnv_xive_set_xlate_edt_offset(PnvXive *xive, uint64_t= vc_offset, >> + uint64_t type) >> +{ >> + int i; >> + uint64_t edt_size =3D (1ull << xive->edt_shift); >> + uint64_t edt_offset =3D vc_offset; >> + >> + for (i =3D 0; i < XIVE_XLATE_EDT_MAX && (i * edt_size) < vc_offse= t; i++) { >> + uint64_t edt_type =3D GETFIELD(CQ_TDR_EDT_TYPE, xive->set_xla= te_edt[i]); >> + >> + if (edt_type !=3D type) { >> + edt_offset -=3D edt_size; >> + } >> + } >> + >> + return edt_offset; >> +} >> + >> +/* >> + * IPI and END sources realize routines >> + * >> + * We use the EDT table to size the internal XiveSource object backin= g >> + * the IPIs and the XiveENDSource object backing the ENDs >> + */ >> +static void pnv_xive_source_realize(PnvXive *xive, Error **errp) >> +{ >> + XiveSource *xsrc =3D &xive->source; >> + Error *local_err =3D 
NULL;
>> + uint64_t ipi_mmio_size = pnv_xive_set_xlate_edt_size(xive, CQ_TDR_EDT_IPI);
>> +
>> + /* Two pages per IRQ */
>> + xive->nr_irqs = ipi_mmio_size / (1ull << (xive->vc_shift + 1));
>> +
>> + /*
>> + * Configure store EOI if required by firmware (skiboot has
>> + * removed support recently though)
>> + */
>> + if (xive->regs[VC_SBC_CONFIG >> 3] &
>> + (VC_SBC_CONF_CPLX_CIST | VC_SBC_CONF_CIST_BOTH)) {
>> + object_property_set_int(OBJECT(xsrc), XIVE_SRC_STORE_EOI, "flags",
>> + &error_fatal);
>> + }
>> +
>> + object_property_set_int(OBJECT(xsrc), xive->nr_irqs, "nr-irqs",
>> + &error_fatal);
>> + object_property_add_const_link(OBJECT(xsrc), "xive", OBJECT(xive),
>> + &error_fatal);
>> + object_property_set_bool(OBJECT(xsrc), true, "realized", &local_err);
>> + if (local_err) {
>> + error_propagate(errp, local_err);
>> + return;
>> + }
>> + qdev_set_parent_bus(DEVICE(xsrc), sysbus_get_default());
>> +
>> + /* Install the IPI ESB MMIO region in its VC region */
>> + memory_region_add_subregion(&xive->ipi_mmio, 0, &xsrc->esb_mmio);
>> +
>> + /* Start in a clean state */
>> + device_reset(DEVICE(&xive->source));
> 
> I don't think you should need that. During qemu start up all the
> device reset handlers should be called after reset but before starting
> the VM anyway.

Yes, but I chose to realize the source after reset... I will explain
why later.
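As background on the sizing above: pnv_xive_set_xlate_edt_size() boils down to counting the EDT segments of a given type and summing their sizes. A minimal standalone sketch of that idea (the segment table, types, and sizes below are made up for illustration, not the real PnvXive layout):

```c
#include <stdint.h>

#define EDT_MAX 16

/* Sum the sizes of the EDT segments matching 'type'. Each segment
 * maps 'edt_size' bytes of the VC MMIO region to either the IPI ESB
 * pages or the END ESB pages. */
uint64_t edt_type_size(const int edt_type[EDT_MAX], int type,
                       uint64_t edt_size)
{
    uint64_t size = 0;

    for (int i = 0; i < EDT_MAX; i++) {
        if (edt_type[i] == type) {
            size += edt_size;
        }
    }
    return size;
}
```

The resulting size then directly determines how many interrupt sources the backing XiveSource object needs, as in the quoted realize routine.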
=20 =20 >> +} >> + >> +static void pnv_xive_end_source_realize(PnvXive *xive, Error **errp) >> +{ >> + XiveENDSource *end_xsrc =3D &xive->end_source; >> + Error *local_err =3D NULL; >> + uint64_t end_mmio_size =3D pnv_xive_set_xlate_edt_size(xive, CQ_T= DR_EDT_EQ); >> + >> + /* Two pages per END: ESn and ESe */ >> + xive->nr_ends =3D end_mmio_size / (1ull << (xive->vc_shift + 1))= ; >> + >> + object_property_set_int(OBJECT(end_xsrc), xive->nr_ends, "nr-ends= ", >> + &error_fatal); >> + object_property_add_const_link(OBJECT(end_xsrc), "xive", OBJECT(x= ive), >> + &error_fatal); >> + object_property_set_bool(OBJECT(end_xsrc), true, "realized", &loc= al_err); >> + if (local_err) { >> + error_propagate(errp, local_err); >> + return; >> + } >> + qdev_set_parent_bus(DEVICE(end_xsrc), sysbus_get_default()); >> + >> + /* Install the END ESB MMIO region in its VC region */ >> + memory_region_add_subregion(&xive->end_mmio, 0, &end_xsrc->esb_mm= io); >> +} >> + >> +/* >> + * Virtual Structure Tables (VST) configuration >> + */ >> +static void pnv_xive_table_set_exclusive(PnvXive *xive, uint8_t type, >> + uint8_t blk, uint64_t vsd) >> +{ >> + bool gconf_indirect =3D >> + xive->regs[VC_GLOBAL_CONFIG >> 3] & VC_GCONF_INDIRECT; >> + uint32_t vst_shift =3D GETFIELD(VSD_TSIZE, vsd) + 12; >> + uint64_t vst_addr =3D vsd & VSD_ADDRESS_MASK; >> + >> + if (VSD_INDIRECT & vsd) { >> + if (!gconf_indirect) { >> + xive_error(xive, "VST: %s indirect tables not enabled", >> + vst_infos[type].name); >> + return; >> + } >> + } >> + >> + switch (type) { >> + case VST_TSEL_IVT: >> + /* >> + * This is our trigger to create the XiveSource object backin= g >> + * the IPIs. >> + */ >> + pnv_xive_source_realize(xive, &error_fatal); >=20 > IIUC this gets called in response to an MMIO. Realizing devices in > response to a runtime MMIO looks very wrong. It does. I agree but I didn't find the appropriate modeling for=20 the problem I am trying to solve. 
The problem is to create a XiveSource object of the appropriate size,
depending on how the software configured the XIVE IC.

We could choose to cover the maximum that the VC MMIO region can map.
I might do that in the next version.

> 
>> + break;
>> +
>> + case VST_TSEL_EQDT:
>> + /* Same trigger but for the XiveENDSource object backing the ENDs. */
>> + pnv_xive_end_source_realize(xive, &error_fatal);
>> + break;
>> +
>> + case VST_TSEL_VPDT:
>> + /* FIXME (skiboot) : remove DD1 workaround on the NVT table size */
>> + vst_shift = 16;
>> + break;
>> +
>> + case VST_TSEL_SBE: /* Not modeled */
>> + /*
>> + * Contains the backing store pages for the source PQ bits.
>> + * The XiveSource object has its own. We would need a custom
>> + * source object to use this backing.
>> + */
>> + break;
>> +
>> + case VST_TSEL_IRQ: /* VC only. Not modeled */
>> + /*
>> + * These tables contain the backing store pages for the
>> + * interrupt fifos of the VC sub-engine in case of overflow.
>> + */
>> + break;
>> + default:
>> + g_assert_not_reached();
>> + }
>> +
>> + if (!QEMU_IS_ALIGNED(vst_addr, 1ull << vst_shift)) {
>> + xive_error(xive, "VST: %s table address 0x%"PRIx64" is not aligned with"
>> + " page shift %d", vst_infos[type].name, vst_addr, vst_shift);
>> + }
>> +
>> + /* Keep the VSD for later use */
>> + xive->vsds[type][blk] = vsd;

So I suppose I should also use the MMIOed value for such a configuration,
rather than storing it in its own field?

>> +}
>> +
>> +/*
>> + * Both PC and VC sub-engines are configured here, as each uses the
>> + * Virtual Structure Tables : SBE, EAS, END and NVT.
>> + */ >> +static void pnv_xive_table_set_data(PnvXive *xive, uint64_t vsd, bool= pc_engine) >> +{ >> + uint8_t mode =3D GETFIELD(VSD_MODE, vsd); >> + uint8_t type =3D GETFIELD(VST_TABLE_SELECT, >> + xive->regs[VC_VSD_TABLE_ADDR >> 3]); >> + uint8_t blk =3D GETFIELD(VST_TABLE_BLOCK, >> + xive->regs[VC_VSD_TABLE_ADDR >> 3]); >> + uint64_t vst_addr =3D vsd & VSD_ADDRESS_MASK; >> + >> + if (type > VST_TSEL_IRQ) { >> + xive_error(xive, "VST: invalid table type %d", type); >> + return; >> + } >> + >> + if (blk >=3D vst_infos[type].max_blocks) { >> + xive_error(xive, "VST: invalid block id %d for" >> + " %s table", blk, vst_infos[type].name); >> + return; >> + } >> + >> + /* >> + * Only take the VC sub-engine configuration into account because >> + * the XiveRouter model combines both VC and PC sub-engines >> + */ >> + if (pc_engine) { >> + return; >> + } >> + >> + if (!vst_addr) { >> + xive_error(xive, "VST: invalid %s table address", vst_infos[t= ype].name); >> + return; >> + } >> + >> + switch (mode) { >> + case VSD_MODE_FORWARD: >> + xive->vsds[type][blk] =3D vsd; >> + break; >> + >> + case VSD_MODE_EXCLUSIVE: >> + pnv_xive_table_set_exclusive(xive, type, blk, vsd); >> + break; >> + >> + default: >> + xive_error(xive, "VST: unsupported table mode %d", mode); >> + return; >> + } >> +} >> + >> +/* >> + * When the TIMA is accessed from the indirect page, the thread id >> + * (PIR) has to be configured in the IC before. This is used for >> + * resets and for debug purpose also. 
>> + */
>> +static void pnv_xive_thread_indirect_set(PnvXive *xive, uint64_t val)
>> +{
>> + int pir = GETFIELD(PC_TCTXT_INDIR_THRDID, xive->regs[PC_TCTXT_INDIR0 >> 3]);
>> +
>> + if (val & PC_TCTXT_INDIR_VALID) {
>> + if (xive->cpu_ind) {
>> + xive_error(xive, "IC: indirect access already set for "
>> + "invalid PIR %d", pir);
>> + }
>> +
>> + pir = GETFIELD(PC_TCTXT_INDIR_THRDID, val) & 0xff;
>> + xive->cpu_ind = ppc_get_vcpu_by_pir(pir);
>> + if (!xive->cpu_ind) {
>> + xive_error(xive, "IC: invalid PIR %d for indirect access", pir);
>> + }
>> + } else {
>> + xive->cpu_ind = NULL;
>> + }
>> +}
>> +
>> +/*
>> + * Interrupt Controller registers MMIO
>> + */
>> +static void pnv_xive_ic_reg_write(PnvXive *xive, uint32_t offset, uint64_t val,
>> + bool mmio)
>> +{
>> + MemoryRegion *sysmem = get_system_memory();
>> + uint32_t reg = offset >> 3;
>> +
>> + switch (offset) {
>> +
>> + /*
>> + * XIVE CQ (PowerBus bridge) settings
>> + */
>> + case CQ_MSGSND: /* msgsnd for doorbells */
>> + case CQ_FIRMASK_OR: /* FIR error reporting */
>> + xive->regs[reg] = val;
> 
> Can you do that generic update outside the switch? If that leaves too
> many special cases that might be a sign you shouldn't use the
> big-array-of-regs model.

I also use the 'switch' to segment the different registers of the
controller depending on the sub-engine. I think the big-array-of-regs
model should work. I need to add a couple more helper routines.
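To make the big-array-of-regs model being discussed concrete, here is a generic sketch (names and sizes are invented, not the actual PnvXive layout): a flat array indexed by register offset, with a generic store followed by per-offset side-effect handling.

```c
#include <stdint.h>

#define NR_REGS 0x200            /* one slot per 8-byte register, 0x0..0xFFF */

typedef struct {
    uint64_t regs[NR_REGS];
} RegsSketch;

/* Generic update first; offset-specific side effects would follow in a
 * switch, so most cases need no explicit store of their own. */
void reg_write(RegsSketch *s, uint32_t offset, uint64_t val)
{
    s->regs[offset >> 3] = val;
    /* switch (offset) { ... side effects and special cases ... } */
}

uint64_t reg_read(RegsSketch *s, uint32_t offset)
{
    return s->regs[offset >> 3];
}
```

The alternative model mentioned in the review keeps named fields in the device state and parses MMIO accesses into them; the point being made is to pick one model rather than mix both.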
>> + break; >> + case CQ_PBI_CTL: >> + if (val & CQ_PBI_PC_64K) { >> + xive->pc_shift =3D 16; >> + } >> + if (val & CQ_PBI_VC_64K) { >> + xive->vc_shift =3D 16; >> + } >> + break; >> + case CQ_CFG_PB_GEN: /* PowerBus General Configuration */ >> + /* >> + * TODO: CQ_INT_ADDR_OPT for 1-block-per-chip mode >> + */ >> + xive->regs[reg] =3D val; >> + break; >> + >> + /* >> + * XIVE Virtualization Controller settings >> + */ >> + case VC_GLOBAL_CONFIG: >> + xive->regs[reg] =3D val; >> + break; >> + >> + /* >> + * XIVE Presenter Controller settings >> + */ >> + case PC_GLOBAL_CONFIG: >> + /* Overrides Int command Chip ID with the Chip ID field */ >> + if (val & PC_GCONF_CHIPID_OVR) { >> + xive->chip_id =3D GETFIELD(PC_GCONF_CHIPID, val); >> + } >> + xive->regs[reg] =3D val; >> + break; >> + case PC_TCTXT_CFG: >> + /* >> + * TODO: PC_TCTXT_CFG_BLKGRP_EN for block group support >> + * TODO: PC_TCTXT_CFG_HARD_CHIPID_BLK >> + */ >> + >> + /* >> + * Moves the chipid into block field for hardwired CAM >> + * compares Block offset value is adjusted to 0b0..01 & ThrdI= d >> + */ >> + if (val & PC_TCTXT_CHIPID_OVERRIDE) { >> + xive->thread_chip_id =3D GETFIELD(PC_TCTXT_CHIPID, val); >> + } >> + break; >> + case PC_TCTXT_TRACK: /* Enable block tracking (DD2) */ >> + xive->regs[reg] =3D val; >> + break; >> + >> + /* >> + * Misc settings >> + */ >> + case VC_EQC_CONFIG: /* enable silent escalation */ >> + case VC_SBC_CONFIG: /* Store EOI configuration */ >> + case VC_AIB_TX_ORDER_TAG2: >> + xive->regs[reg] =3D val; >> + break; >> + >> + /* >> + * XIVE BAR settings (XSCOM only) >> + */ >> + case CQ_RST_CTL: >> + /* resets all bars */ >> + break; >> + >> + case CQ_IC_BAR: /* IC BAR. 8 pages */ >> + xive->ic_shift =3D val & CQ_IC_BAR_64K ? 
16 : 12; >> + if (!(val & CQ_IC_BAR_VALID)) { >> + xive->ic_base =3D 0; >> + if (xive->regs[reg] & CQ_IC_BAR_VALID) { >> + memory_region_del_subregion(&xive->ic_mmio, >> + &xive->ic_reg_mmio); >> + memory_region_del_subregion(&xive->ic_mmio, >> + &xive->ic_notify_mmio); >> + memory_region_del_subregion(sysmem, &xive->ic_mmio); >> + memory_region_del_subregion(sysmem, &xive->tm_mmio_in= direct); >> + } >> + } else { >> + xive->ic_base =3D val & ~(CQ_IC_BAR_VALID | CQ_IC_BAR_64= K); >> + if (!(xive->regs[reg] & CQ_IC_BAR_VALID)) { >> + memory_region_add_subregion(sysmem, xive->ic_base, >> + &xive->ic_mmio); >> + memory_region_add_subregion(&xive->ic_mmio, 0, >> + &xive->ic_reg_mmio); >> + memory_region_add_subregion(&xive->ic_mmio, >> + 1ul << xive->ic_shift, >> + &xive->ic_notify_mmio); >> + memory_region_add_subregion(sysmem, >> + xive->ic_base + (4ull << xive->ic_= shift), >> + &xive->tm_mmio_indirect); >> + } >> + } >> + xive->regs[reg] =3D val; >> + break; >> + >> + case CQ_TM1_BAR: /* TM BAR and page size. 4 pages */ >> + case CQ_TM2_BAR: /* second TM BAR is for hotplug use */ >> + xive->tm_shift =3D val & CQ_TM_BAR_64K ? 16 : 12; >> + if (!(val & CQ_TM_BAR_VALID)) { >> + xive->tm_base =3D 0; >> + if (xive->regs[reg] & CQ_TM_BAR_VALID) { >> + memory_region_del_subregion(sysmem, &xive->tm_mmio); >> + } >> + } else { >> + xive->tm_base =3D val & ~(CQ_TM_BAR_VALID | CQ_TM_BAR_64= K); >> + if (!(xive->regs[reg] & CQ_TM_BAR_VALID)) { >> + memory_region_add_subregion(sysmem, xive->tm_base, >> + &xive->tm_mmio); >> + } >> + } >> + xive->regs[reg] =3D val; >> + break; >=20 > Something funny with your indentation here. 
>=20 >> + case CQ_PC_BAR: >> + if (!(val & CQ_PC_BAR_VALID)) { >> + xive->pc_base =3D 0; >> + if (xive->regs[reg] & CQ_PC_BAR_VALID) { >> + memory_region_del_subregion(sysmem, &xive->pc_mmio); >> + } >> + } else { >> + xive->pc_base =3D val & ~(CQ_PC_BAR_VALID); >> + if (!(xive->regs[reg] & CQ_PC_BAR_VALID)) { >> + memory_region_add_subregion(sysmem, xive->pc_base, >> + &xive->pc_mmio); >> + } >> + } >> + xive->regs[reg] =3D val; >> + break; >> + case CQ_PC_BARM: /* TODO: configure PC BAR size at runtime */ >> + xive->pc_size =3D (~val + 1) & CQ_PC_BARM_MASK; >> + xive->regs[reg] =3D val; >> + >> + /* Compute the size of the VDT sets */ >> + xive->vdt_shift =3D ctz64(xive->pc_size / XIVE_XLATE_VDT_MAX)= ; >> + break; >> + >> + case CQ_VC_BAR: /* From 64M to 4TB */ >> + if (!(val & CQ_VC_BAR_VALID)) { >> + xive->vc_base =3D 0; >> + if (xive->regs[reg] & CQ_VC_BAR_VALID) { >> + memory_region_del_subregion(sysmem, &xive->vc_mmio); >> + } >> + } else { >> + xive->vc_base =3D val & ~(CQ_VC_BAR_VALID); >> + if (!(xive->regs[reg] & CQ_VC_BAR_VALID)) { >> + memory_region_add_subregion(sysmem, xive->vc_base, >> + &xive->vc_mmio); >> + } >> + } >> + xive->regs[reg] =3D val; >> + break; >> + case CQ_VC_BARM: /* TODO: configure VC BAR size at runtime */ >> + xive->vc_size =3D (~val + 1) & CQ_VC_BARM_MASK; >=20 > Any reason to precompute that, rather than work it out from > regs[CQ_VC_BARM] when you need it? Apart from having a name making more sense (it matches vc_mmio), no.=20 I can remove these.=20 >> + xive->regs[reg] =3D val; >> + >> + /* Compute the size of the EDT sets */ >> + xive->edt_shift =3D ctz64(xive->vc_size / XIVE_XLATE_EDT_MAX)= ; >> + break; >> + >> + /* >> + * XIVE Set Translation Table settings. 
Defines the layout of the >> + * VC BAR containing the ESB pages of the IPIs and of the ENDs >> + */ >> + case CQ_TAR: /* Set Translation Table Address */ >> + pnv_xive_set_xlate_select(xive, val); >> + break; >> + case CQ_TDR: /* Set Translation Table Data */ >> + pnv_xive_set_xlate_update(xive, val); >> + break; >> + >> + /* >> + * XIVE VC & PC Virtual Structure Table settings >> + */ >> + case VC_VSD_TABLE_ADDR: >> + case PC_VSD_TABLE_ADDR: /* Virtual table selector */ >> + xive->regs[reg] =3D val; >> + break; >> + case VC_VSD_TABLE_DATA: /* Virtual table setting */ >> + case PC_VSD_TABLE_DATA: >> + pnv_xive_table_set_data(xive, val, offset =3D=3D PC_VSD_TABLE= _DATA); >> + break; >> + >> + /* >> + * Interrupt fifo overflow in memory backing store. Not modeled >> + */ >> + case VC_IRQ_CONFIG_IPI: >> + case VC_IRQ_CONFIG_HW: >> + case VC_IRQ_CONFIG_CASCADE1: >> + case VC_IRQ_CONFIG_CASCADE2: >> + case VC_IRQ_CONFIG_REDIST: >> + case VC_IRQ_CONFIG_IPI_CASC: >> + xive->regs[reg] =3D val; >> + break; >> + >> + /* >> + * XIVE hardware thread enablement >> + */ >> + case PC_THREAD_EN_REG0_SET: /* Physical Thread Enable */ >> + case PC_THREAD_EN_REG1_SET: /* Physical Thread Enable (fused core= ) */ >> + xive->regs[reg] |=3D val; >> + break; >> + case PC_THREAD_EN_REG0_CLR: >> + xive->regs[PC_THREAD_EN_REG0_SET >> 3] &=3D ~val; >> + break; >> + case PC_THREAD_EN_REG1_CLR: >> + xive->regs[PC_THREAD_EN_REG1_SET >> 3] &=3D ~val; >> + break; >> + >> + /* >> + * Indirect TIMA access set up. Defines the HW thread to use. 
>> + */ >> + case PC_TCTXT_INDIR0: >> + pnv_xive_thread_indirect_set(xive, val); >> + xive->regs[reg] =3D val; >> + break; >> + case PC_TCTXT_INDIR1: >> + case PC_TCTXT_INDIR2: >> + case PC_TCTXT_INDIR3: >> + /* TODO: check what PC_TCTXT_INDIR[123] are for */ >> + xive->regs[reg] =3D val; >> + break; >> + >> + /* >> + * XIVE PC & VC cache updates for EAS, NVT and END >> + */ >> + case PC_VPC_SCRUB_MASK: >> + case PC_VPC_CWATCH_SPEC: >> + case VC_EQC_SCRUB_MASK: >> + case VC_EQC_CWATCH_SPEC: >> + case VC_IVC_SCRUB_MASK: >> + xive->regs[reg] =3D val; >> + break; >> + case VC_IVC_SCRUB_TRIG: >> + pnv_xive_eas_update(xive, GETFIELD(VC_SCRUB_OFFSET, val)); >> + break; >> + case PC_VPC_CWATCH_DAT0: >> + case PC_VPC_CWATCH_DAT1: >> + case PC_VPC_CWATCH_DAT2: >> + case PC_VPC_CWATCH_DAT3: >> + case PC_VPC_CWATCH_DAT4: >> + case PC_VPC_CWATCH_DAT5: >> + case PC_VPC_CWATCH_DAT6: >> + case PC_VPC_CWATCH_DAT7: >> + xive->vpc_watch[(offset - PC_VPC_CWATCH_DAT0) / 8] =3D cpu_to= _be64(val); >> + break; >> + case PC_VPC_SCRUB_TRIG: >> + pnv_xive_nvt_update(xive, GETFIELD(PC_SCRUB_BLOCK_ID, val), >> + GETFIELD(PC_SCRUB_OFFSET, val)); >> + break; >> + case VC_EQC_CWATCH_DAT0: >> + case VC_EQC_CWATCH_DAT1: >> + case VC_EQC_CWATCH_DAT2: >> + case VC_EQC_CWATCH_DAT3: >> + xive->eqc_watch[(offset - VC_EQC_CWATCH_DAT0) / 8] =3D cpu_to= _be64(val); >> + break; >> + case VC_EQC_SCRUB_TRIG: >> + pnv_xive_end_update(xive, GETFIELD(VC_SCRUB_BLOCK_ID, val), >> + GETFIELD(VC_SCRUB_OFFSET, val)); >> + break; >> + >> + /* >> + * XIVE PC & VC cache invalidation >> + */ >> + case PC_AT_KILL: >> + xive->regs[reg] |=3D val; >> + break; >> + case VC_AT_MACRO_KILL: >> + xive->regs[reg] |=3D val; >> + break; >> + case PC_AT_KILL_MASK: >> + case VC_AT_MACRO_KILL_MASK: >> + xive->regs[reg] =3D val; >> + break; >> + >> + default: >> + xive_error(xive, "IC: invalid write to reg=3D0x%08x mmio=3D%d= ", offset, >> + mmio); >> + } >> +} >> + >> +static uint64_t pnv_xive_ic_reg_read(PnvXive *xive, uint32_t 
offset, = bool mmio) >> +{ >> + uint64_t val =3D 0; >> + uint32_t reg =3D offset >> 3; >> + >> + switch (offset) { >> + case CQ_CFG_PB_GEN: >> + case CQ_IC_BAR: >> + case CQ_TM1_BAR: >> + case CQ_TM2_BAR: >> + case CQ_PC_BAR: >> + case CQ_PC_BARM: >> + case CQ_VC_BAR: >> + case CQ_VC_BARM: >> + case CQ_TAR: >> + case CQ_TDR: >> + case CQ_PBI_CTL: >> + >> + case PC_TCTXT_CFG: >> + case PC_TCTXT_TRACK: >> + case PC_TCTXT_INDIR0: >> + case PC_TCTXT_INDIR1: >> + case PC_TCTXT_INDIR2: >> + case PC_TCTXT_INDIR3: >> + case PC_GLOBAL_CONFIG: >> + >> + case PC_VPC_SCRUB_MASK: >> + case PC_VPC_CWATCH_SPEC: >> + case PC_VPC_CWATCH_DAT0: >> + case PC_VPC_CWATCH_DAT1: >> + case PC_VPC_CWATCH_DAT2: >> + case PC_VPC_CWATCH_DAT3: >> + case PC_VPC_CWATCH_DAT4: >> + case PC_VPC_CWATCH_DAT5: >> + case PC_VPC_CWATCH_DAT6: >> + case PC_VPC_CWATCH_DAT7: >> + >> + case VC_GLOBAL_CONFIG: >> + case VC_AIB_TX_ORDER_TAG2: >> + >> + case VC_IRQ_CONFIG_IPI: >> + case VC_IRQ_CONFIG_HW: >> + case VC_IRQ_CONFIG_CASCADE1: >> + case VC_IRQ_CONFIG_CASCADE2: >> + case VC_IRQ_CONFIG_REDIST: >> + case VC_IRQ_CONFIG_IPI_CASC: >> + >> + case VC_EQC_SCRUB_MASK: >> + case VC_EQC_CWATCH_DAT0: >> + case VC_EQC_CWATCH_DAT1: >> + case VC_EQC_CWATCH_DAT2: >> + case VC_EQC_CWATCH_DAT3: >> + >> + case VC_EQC_CWATCH_SPEC: >> + case VC_IVC_SCRUB_MASK: >> + case VC_SBC_CONFIG: >> + case VC_AT_MACRO_KILL_MASK: >> + case VC_VSD_TABLE_ADDR: >> + case PC_VSD_TABLE_ADDR: >> + case VC_VSD_TABLE_DATA: >> + case PC_VSD_TABLE_DATA: >> + val =3D xive->regs[reg]; >> + break; >> + >> + case CQ_MSGSND: /* Identifies which cores have msgsnd enabled. >> + * Say all have. 
*/ >> + val =3D 0xffffff0000000000; >> + break; >> + >> + /* >> + * XIVE PC & VC cache updates for EAS, NVT and END >> + */ >> + case PC_VPC_SCRUB_TRIG: >> + case VC_IVC_SCRUB_TRIG: >> + case VC_EQC_SCRUB_TRIG: >> + xive->regs[reg] &=3D ~VC_SCRUB_VALID; >> + val =3D xive->regs[reg]; >> + break; >> + >> + /* >> + * XIVE PC & VC cache invalidation >> + */ >> + case PC_AT_KILL: >> + xive->regs[reg] &=3D ~PC_AT_KILL_VALID; >> + val =3D xive->regs[reg]; >> + break; >> + case VC_AT_MACRO_KILL: >> + xive->regs[reg] &=3D ~VC_KILL_VALID; >> + val =3D xive->regs[reg]; >> + break; >> + >> + /* >> + * XIVE synchronisation >> + */ >> + case VC_EQC_CONFIG: >> + val =3D VC_EQC_SYNC_MASK; >> + break; >> + >> + default: >> + xive_error(xive, "IC: invalid read reg=3D0x%08x mmio=3D%d", o= ffset, mmio); >> + } >> + >> + return val; >> +} >> + >> +static void pnv_xive_ic_reg_write_mmio(void *opaque, hwaddr addr, >> + uint64_t val, unsigned size) >> +{ >> + pnv_xive_ic_reg_write(opaque, addr, val, true); >=20 > AFAICT the underlaying write function never uses that 'mmio' parameter > except for debug, so it's probably not worth the bother of having > these wrappers. OK. I will check. 
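As an aside on the CQ_PC_BARM / CQ_VC_BARM handling quoted earlier: the BAR size is derived from the mask register with the usual two's-complement trick, `size = (~val + 1) & MASK`. A standalone sketch of just that arithmetic (the values below are made up):

```c
#include <stdint.h>

/* Derive a BAR size from a BARM-style mask register: the ones-complement
 * plus one of the mask value, restricted to the architected mask bits. */
uint64_t barm_size(uint64_t barm_val, uint64_t barm_mask)
{
    return (~barm_val + 1) & barm_mask;
}
```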
>> +} >> + >> +static uint64_t pnv_xive_ic_reg_read_mmio(void *opaque, hwaddr addr, >> + unsigned size) >> +{ >> + return pnv_xive_ic_reg_read(opaque, addr, true); >> +} >> + >> +static const MemoryRegionOps pnv_xive_ic_reg_ops =3D { >> + .read =3D pnv_xive_ic_reg_read_mmio, >> + .write =3D pnv_xive_ic_reg_write_mmio, >> + .endianness =3D DEVICE_BIG_ENDIAN, >> + .valid =3D { >> + .min_access_size =3D 8, >> + .max_access_size =3D 8, >> + }, >> + .impl =3D { >> + .min_access_size =3D 8, >> + .max_access_size =3D 8, >> + }, >> +}; >> + >> +/* >> + * Interrupt Controller MMIO: Notify port page (write only) >> + */ >> +#define PNV_XIVE_FORWARD_IPI 0x800 /* Forward IPI */ >> +#define PNV_XIVE_FORWARD_HW 0x880 /* Forward HW */ >> +#define PNV_XIVE_FORWARD_OS_ESC 0x900 /* Forward OS escalation */ >> +#define PNV_XIVE_FORWARD_HW_ESC 0x980 /* Forward Hyp escalation *= / >> +#define PNV_XIVE_FORWARD_REDIS 0xa00 /* Forward Redistribution *= / >> +#define PNV_XIVE_RESERVED5 0xa80 /* Cache line 5 PowerBUS op= eration */ >> +#define PNV_XIVE_RESERVED6 0xb00 /* Cache line 6 PowerBUS op= eration */ >> +#define PNV_XIVE_RESERVED7 0xb80 /* Cache line 7 PowerBUS op= eration */ >> + >> +/* VC synchronisation */ >> +#define PNV_XIVE_SYNC_IPI 0xc00 /* Sync IPI */ >> +#define PNV_XIVE_SYNC_HW 0xc80 /* Sync HW */ >> +#define PNV_XIVE_SYNC_OS_ESC 0xd00 /* Sync OS escalation */ >> +#define PNV_XIVE_SYNC_HW_ESC 0xd80 /* Sync Hyp escalation */ >> +#define PNV_XIVE_SYNC_REDIS 0xe00 /* Sync Redistribution */ >> + >> +/* PC synchronisation */ >> +#define PNV_XIVE_SYNC_PULL 0xe80 /* Sync pull context */ >> +#define PNV_XIVE_SYNC_PUSH 0xf00 /* Sync push context */ >> +#define PNV_XIVE_SYNC_VPC 0xf80 /* Sync remove VPC store */ >> + >> +static void pnv_xive_ic_hw_trigger(PnvXive *xive, hwaddr addr, uint64= _t val) >> +{ >> + XiveFabricClass *xfc =3D XIVE_FABRIC_GET_CLASS(xive); >> + >> + xfc->notify(XIVE_FABRIC(xive), val); >> +} >> + >> +static void pnv_xive_ic_notify_write(void *opaque, hwaddr 
addr, uint6= 4_t val, >> + unsigned size) >> +{ >> + PnvXive *xive =3D PNV_XIVE(opaque); >> + >> + /* VC: HW triggers */ >> + switch (addr) { >> + case 0x000 ... 0x7FF: >> + pnv_xive_ic_hw_trigger(opaque, addr, val); >> + break; >> + >> + /* VC: Forwarded IRQs */ >> + case PNV_XIVE_FORWARD_IPI: >> + case PNV_XIVE_FORWARD_HW: >> + case PNV_XIVE_FORWARD_OS_ESC: >> + case PNV_XIVE_FORWARD_HW_ESC: >> + case PNV_XIVE_FORWARD_REDIS: >> + /* TODO: forwarded IRQs. Should be like HW triggers */ >> + xive_error(xive, "IC: forwarded at @0x%"HWADDR_PRIx" IRQ 0x%"= PRIx64, >> + addr, val); >> + break; >> + >> + /* VC syncs */ >> + case PNV_XIVE_SYNC_IPI: >> + case PNV_XIVE_SYNC_HW: >> + case PNV_XIVE_SYNC_OS_ESC: >> + case PNV_XIVE_SYNC_HW_ESC: >> + case PNV_XIVE_SYNC_REDIS: >> + break; >> + >> + /* PC sync */ >> + case PNV_XIVE_SYNC_PULL: >> + case PNV_XIVE_SYNC_PUSH: >> + case PNV_XIVE_SYNC_VPC: >> + break; >> + >> + default: >> + xive_error(xive, "IC: invalid notify write @%"HWADDR_PRIx, ad= dr); >> + } >> +} >> + >> +static uint64_t pnv_xive_ic_notify_read(void *opaque, hwaddr addr, >> + unsigned size) >> +{ >> + PnvXive *xive =3D PNV_XIVE(opaque); >> + >> + /* loads are invalid */ >> + xive_error(xive, "IC: invalid notify read @%"HWADDR_PRIx, addr); >> + return -1; >> +} >> + >> +static const MemoryRegionOps pnv_xive_ic_notify_ops =3D { >> + .read =3D pnv_xive_ic_notify_read, >> + .write =3D pnv_xive_ic_notify_write, >> + .endianness =3D DEVICE_BIG_ENDIAN, >> + .valid =3D { >> + .min_access_size =3D 8, >> + .max_access_size =3D 8, >> + }, >> + .impl =3D { >> + .min_access_size =3D 8, >> + .max_access_size =3D 8, >> + }, >> +}; >> + >> +/* >> + * Interrupt controller MMIO region. 
The layout is compatible between
>> + * 4K and 64K pages :
>> + *
>> + * Page 0 sub-engine BARs
>> + * 0x000 - 0x3FF IC registers
>> + * 0x400 - 0x7FF PC registers
>> + * 0x800 - 0xFFF VC registers
>> + *
>> + * Page 1 Notify page
>> + * 0x000 - 0x7FF HW interrupt triggers (PSI, PHB)
>> + * 0x800 - 0xFFF forwards and syncs
>> + *
>> + * Page 2 LSI Trigger page (writes only) (not modeled)
>> + * Page 3 LSI SB EOI page (reads only) (not modeled)
>> + *
>> + * Page 4-7 indirect TIMA (aliased to TIMA region)
>> + */
>> +static void pnv_xive_ic_write(void *opaque, hwaddr addr,
>> + uint64_t val, unsigned size)
>> +{
>> + PnvXive *xive = PNV_XIVE(opaque);
>> +
>> + xive_error(xive, "IC: invalid write @%"HWADDR_PRIx, addr);
>> +}
>> +
>> +static uint64_t pnv_xive_ic_read(void *opaque, hwaddr addr, unsigned size)
>> +{
>> + PnvXive *xive = PNV_XIVE(opaque);
>> +
>> + xive_error(xive, "IC: invalid read @%"HWADDR_PRIx, addr);
>> + return -1;
>> +}
>> +
>> +static const MemoryRegionOps pnv_xive_ic_ops = {
>> + .read = pnv_xive_ic_read,
>> + .write = pnv_xive_ic_write,
> 
> Erm.. it's not clear to me what this achieves, since the read/write
> accessors just error every time.

These are the ops for the main IC MMIO region (8 pages), which contains
the subregions ic_reg_mmio and ic_notify_mmio, one page each. Pages 2-3
are not implemented and pages 4-7 are mapped separately, so we have a
hole that falls through to these error accessors.

>> + .endianness = DEVICE_BIG_ENDIAN,
>> + .valid = {
>> + .min_access_size = 8,
>> + .max_access_size = 8,
>> + },
>> + .impl = {
>> + .min_access_size = 8,
>> + .max_access_size = 8,
>> + },
>> +};
>> +
>> +/*
>> + * Interrupt controller XSCOM region. Load accesses are nearly all
>> + * done through the MMIO region.
>> + */
>> +static uint64_t pnv_xive_xscom_read(void *opaque, hwaddr addr, unsigned size)
>> +{
>> + PnvXive *xive = PNV_XIVE(opaque);
>> +
>> + switch (addr >> 3) {
>> + case X_VC_EQC_CONFIG:
>> + /*
>> + * This is the only XSCOM load done in skiboot.
Bizarre. To b= e >> + * checked. >> + */ >> + return VC_EQC_SYNC_MASK; >> + default: >> + return pnv_xive_ic_reg_read(xive, addr, false); >> + } >> +} >> + >> +static void pnv_xive_xscom_write(void *opaque, hwaddr addr, >> + uint64_t val, unsigned size) >> +{ >> + pnv_xive_ic_reg_write(opaque, addr, val, false); >> +} >> + >> +static const MemoryRegionOps pnv_xive_xscom_ops =3D { >> + .read =3D pnv_xive_xscom_read, >> + .write =3D pnv_xive_xscom_write, >> + .endianness =3D DEVICE_BIG_ENDIAN, >> + .valid =3D { >> + .min_access_size =3D 8, >> + .max_access_size =3D 8, >> + }, >> + .impl =3D { >> + .min_access_size =3D 8, >> + .max_access_size =3D 8, >> + } >> +}; >> + >> +/* >> + * Virtualization Controller MMIO region containing the IPI and END E= SB pages >> + */ >> +static uint64_t pnv_xive_vc_read(void *opaque, hwaddr offset, >> + unsigned size) >> +{ >> + PnvXive *xive =3D PNV_XIVE(opaque); >> + uint64_t edt_index =3D offset >> xive->edt_shift; >> + uint64_t edt_type =3D 0; >> + uint64_t ret =3D -1; >> + uint64_t edt_offset; >> + MemTxResult result; >> + AddressSpace *edt_as =3D NULL; >> + >> + if (edt_index < XIVE_XLATE_EDT_MAX) { >> + edt_type =3D GETFIELD(CQ_TDR_EDT_TYPE, xive->set_xlate_edt[ed= t_index]); >> + } >> + >> + switch (edt_type) { >> + case CQ_TDR_EDT_IPI: >> + edt_as =3D &xive->ipi_as; >> + break; >> + case CQ_TDR_EDT_EQ: >> + edt_as =3D &xive->end_as; >> + break; >> + default: >> + xive_error(xive, "VC: invalid read @%"HWADDR_PRIx, offset); >> + return -1; >> + } >> + >> + /* remap the offset for the targeted address space */ >> + edt_offset =3D pnv_xive_set_xlate_edt_offset(xive, offset, edt_ty= pe); >> + >> + ret =3D address_space_ldq(edt_as, edt_offset, MEMTXATTRS_UNSPECIF= IED, >> + &result); >=20 > I think there needs to be a byteswap here somewhere. This is loading > a value from a BE table, AFAICT... 
> 
>> +    if (result != MEMTX_OK) {
>> +        xive_error(xive, "VC: %s read failed at @0x%"HWADDR_PRIx " -> @0x%"
>> +                   HWADDR_PRIx, edt_type == CQ_TDR_EDT_IPI ? "IPI" : "END",
>> +                   offset, edt_offset);
>> +        return -1;
>> +    }
> 
> ... but these helpers are expected to return host-native values.

Hmm, yes.

This works today because these address spaces are backed by the ESB pages
of the XiveSource and the XiveENDSource, for which the data can be ignored.

>> +    return ret;
>> +}
>> +
>> +static void pnv_xive_vc_write(void *opaque, hwaddr offset,
>> +                              uint64_t val, unsigned size)
>> +{
>> +    PnvXive *xive = PNV_XIVE(opaque);
>> +    uint64_t edt_index = offset >> xive->edt_shift;
>> +    uint64_t edt_type = 0;
>> +    uint64_t edt_offset;
>> +    MemTxResult result;
>> +    AddressSpace *edt_as = NULL;
>> +
>> +    if (edt_index < XIVE_XLATE_EDT_MAX) {
>> +        edt_type = GETFIELD(CQ_TDR_EDT_TYPE, xive->set_xlate_edt[edt_index]);
>> +    }
>> +
>> +    switch (edt_type) {
>> +    case CQ_TDR_EDT_IPI:
>> +        edt_as = &xive->ipi_as;
>> +        break;
>> +    case CQ_TDR_EDT_EQ:
>> +        edt_as = &xive->end_as;
>> +        break;
>> +    default:
>> +        xive_error(xive, "VC: invalid write @%"HWADDR_PRIx, offset);
>> +        return;
>> +    }
>> +
>> +    /* remap the offset for the targeted address space */
>> +    edt_offset = pnv_xive_set_xlate_edt_offset(xive, offset, edt_type);
>> +
>> +    address_space_stq(edt_as, edt_offset, val, MEMTXATTRS_UNSPECIFIED, &result);
>> +    if (result != MEMTX_OK) {
>> +        xive_error(xive, "VC: write failed at @0x%"HWADDR_PRIx, edt_offset);
>> +    }
>> +}
>> +
>> +static const MemoryRegionOps pnv_xive_vc_ops = {
>> +    .read = pnv_xive_vc_read,
>> +    .write = pnv_xive_vc_write,
>> +    .endianness = DEVICE_BIG_ENDIAN,
>> +    .valid = {
>> +        .min_access_size = 8,
>> +        .max_access_size = 8,
>> +    },
>> +    .impl = {
>> +        .min_access_size = 8,
>> +        .max_access_size = 8,
>> +    },
>> +};
>> +
>> +/*
>> + * Presenter Controller MMIO region.
>> + * This is used by the Virtualization Controller to update the IPB in
>> + * the NVT table when required. Not implemented.
>> + */
>> +static uint64_t pnv_xive_pc_read(void *opaque, hwaddr addr,
>> +                                 unsigned size)
>> +{
>> +    PnvXive *xive = PNV_XIVE(opaque);
>> +
>> +    xive_error(xive, "PC: invalid read @%"HWADDR_PRIx, addr);
>> +    return -1;
>> +}
>> +
>> +static void pnv_xive_pc_write(void *opaque, hwaddr addr,
>> +                              uint64_t value, unsigned size)
>> +{
>> +    PnvXive *xive = PNV_XIVE(opaque);
>> +
>> +    xive_error(xive, "PC: invalid write to VC @%"HWADDR_PRIx, addr);
>> +}
>> +
>> +static const MemoryRegionOps pnv_xive_pc_ops = {
>> +    .read = pnv_xive_pc_read,
>> +    .write = pnv_xive_pc_write,
>> +    .endianness = DEVICE_BIG_ENDIAN,
>> +    .valid = {
>> +        .min_access_size = 1,
>> +        .max_access_size = 8,
>> +    },
>> +    .impl = {
>> +        .min_access_size = 1,
>> +        .max_access_size = 8,
>> +    },
>> +};
>> +
>> +void pnv_xive_pic_print_info(PnvXive *xive, Monitor *mon)
>> +{
>> +    XiveRouter *xrtr = XIVE_ROUTER(xive);
>> +    XiveEAS eas;
>> +    XiveEND end;
>> +    uint32_t endno = 0;
>> +    uint32_t srcno0 = XIVE_SRCNO(xive->chip_id, 0);
>> +    uint32_t srcno = srcno0;
>> +
>> +    monitor_printf(mon, "XIVE[%x] Source %08x .. %08x\n", xive->chip_id,
>> +                   srcno0, srcno0 + xive->source.nr_irqs - 1);
>> +    xive_source_pic_print_info(&xive->source, srcno0, mon);
>> +
>> +    monitor_printf(mon, "XIVE[%x] EAT %08x .. %08x\n", xive->chip_id,
>> +                   srcno0, srcno0 + xive->nr_irqs - 1);
>> +    while (!xive_router_get_eas(xrtr, srcno, &eas)) {
>> +        if (!(eas.w & EAS_MASKED)) {
>> +            xive_eas_pic_print_info(&eas, srcno, mon);
>> +        }
>> +        srcno++;
>> +    }
>> +
>> +    monitor_printf(mon, "XIVE[%x] ENDT %08x .. %08x\n", xive->chip_id,
>> +                   0, xive->nr_ends - 1);
>> +    while (!xive_router_get_end(xrtr, xrtr->chip_id, endno, &end)) {
>> +        xive_end_pic_print_info(&end, endno++, mon);
>> +    }
>> +}
>> +
>> +static void pnv_xive_reset(DeviceState *dev)
>> +{
>> +    PnvXive *xive = PNV_XIVE(dev);
>> +    PnvChip *chip = PNV_CHIP(object_property_get_link(OBJECT(dev), "chip",
>> +                                                      &error_fatal));
>> +
>> +    /*
>> +     * Use the chip id to identify the XIVE interrupt controller. It
>> +     * can be overridden by configuration at runtime.
>> +     */
>> +    xive->chip_id = xive->thread_chip_id = chip->chip_id;
> 
> You shouldn't need to touch this at reset, only at init/realize.

Yes, apart from the thread_chip_id.

> 
>> +    /* Default page size. Should be changed at runtime to 64k */
>> +    xive->ic_shift = xive->vc_shift = xive->pc_shift = 12;
>> +
>> +    /*
>> +     * PowerNV XIVE sources are realized at runtime when the set
>> +     * translation tables are configured.
> 
> Yeah.. that seems unlikely to be a good idea.

I can try to allocate the maximum IRQ number space for the XiveSource.
I don't know how to size the XiveENDSource though. Hmm, or maybe I should
just map portions of the IPI MMIO region and the END MMIO region depending
on the configuration of the firmware.

>> +     */
>> +    if (DEVICE(&xive->source)->realized) {
>> +        object_property_set_bool(OBJECT(&xive->source), false, "realized",
>> +                                 &error_fatal);
>> +    }
>> +
>> +    if (DEVICE(&xive->end_source)->realized) {
>> +        object_property_set_bool(OBJECT(&xive->end_source), false, "realized",
>> +                                 &error_fatal);
>> +    }
>> +}
>> +
>> +/*
>> + * The VC sub-engine incorporates a source controller for the IPIs.
>> + * When triggered, we need to construct a source number with the
>> + * chip/block identifier.
>> + */
>> +static void pnv_xive_notify(XiveFabric *xf, uint32_t srcno)

We won't need this routine anymore in the version of the model you have
merged. The decoding of the interrupt number is handled by the XiveRouter
now.
>> +{
>> +    PnvXive *xive = PNV_XIVE(xf);
>> +
>> +    xive_router_notify(xf, XIVE_SRCNO(xive->chip_id, srcno));
>> +}
>> +
>> +static void pnv_xive_init(Object *obj)
>> +{
>> +    PnvXive *xive = PNV_XIVE(obj);
>> +
>> +    object_initialize(&xive->source, sizeof(xive->source), TYPE_XIVE_SOURCE);
>> +    object_property_add_child(obj, "source", OBJECT(&xive->source), NULL);
>> +
>> +    object_initialize(&xive->end_source, sizeof(xive->end_source),
>> +                      TYPE_XIVE_END_SOURCE);
>> +    object_property_add_child(obj, "end_source", OBJECT(&xive->end_source),
>> +                              NULL);
>> +}
>> +
>> +static void pnv_xive_realize(DeviceState *dev, Error **errp)
>> +{
>> +    PnvXive *xive = PNV_XIVE(dev);
>> +
>> +    /* Default page size. Generally changed at runtime to 64k */
>> +    xive->ic_shift = xive->vc_shift = xive->pc_shift = 12;
>> +
>> +    /* XSCOM region, used for initial configuration of the BARs */
>> +    memory_region_init_io(&xive->xscom_regs, OBJECT(dev), &pnv_xive_xscom_ops,
>> +                          xive, "xscom-xive", PNV9_XSCOM_XIVE_SIZE << 3);
>> +
>> +    /* Interrupt controller MMIO region */
>> +    memory_region_init_io(&xive->ic_mmio, OBJECT(dev), &pnv_xive_ic_ops, xive,
>> +                          "xive.ic", PNV9_XIVE_IC_SIZE);
>> +    memory_region_init_io(&xive->ic_reg_mmio, OBJECT(dev), &pnv_xive_ic_reg_ops,
>> +                          xive, "xive.ic.reg", 1 << xive->ic_shift);
>> +    memory_region_init_io(&xive->ic_notify_mmio, OBJECT(dev),
>> +                          &pnv_xive_ic_notify_ops,
>> +                          xive, "xive.ic.notify", 1 << xive->ic_shift);
>> +
>> +    /* The Pervasive LSI trigger and EOI pages are not modeled */
>> +
>> +    /*
>> +     * Overall Virtualization Controller MMIO region containing the
>> +     * IPI ESB pages and END ESB pages. The layout is defined by the
>> +     * EDT set translation table and the accesses are dispatched using
>> +     * address spaces for each.
>> +     */
>> +    memory_region_init_io(&xive->vc_mmio, OBJECT(xive), &pnv_xive_vc_ops, xive,
>> +                          "xive.vc", PNV9_XIVE_VC_SIZE);
>> +
>> +    memory_region_init(&xive->ipi_mmio, OBJECT(xive), "xive.vc.ipi",
>> +                       PNV9_XIVE_VC_SIZE);
>> +    address_space_init(&xive->ipi_as, &xive->ipi_mmio, "xive.vc.ipi");
>> +    memory_region_init(&xive->end_mmio, OBJECT(xive), "xive.vc.end",
>> +                       PNV9_XIVE_VC_SIZE);
>> +    address_space_init(&xive->end_as, &xive->end_mmio, "xive.vc.end");
>> +
>> +
>> +    /* Presenter Controller MMIO region (not implemented) */
>> +    memory_region_init_io(&xive->pc_mmio, OBJECT(xive), &pnv_xive_pc_ops, xive,
>> +                          "xive.pc", PNV9_XIVE_PC_SIZE);
>> +
>> +    /* Thread Interrupt Management Area, direct and indirect */
>> +    memory_region_init_io(&xive->tm_mmio, OBJECT(xive), &xive_tm_ops,
>> +                          &xive->cpu_ind, "xive.tima", PNV9_XIVE_TM_SIZE);
>> +    memory_region_init_alias(&xive->tm_mmio_indirect, OBJECT(xive),
>> +                             "xive.tima.indirect",
>> +                             &xive->tm_mmio, 0, PNV9_XIVE_TM_SIZE);
> 
> I'm not quite sure how aliasing to the TIMA can work.  AIUI the TIMA
> via its normal access magically tests the requesting CPU to work out
> which TCTX it should manipulate.  Isn't the idea of the indirect
> access to access some other thread's TIMA for debugging, in which case
> you need to override that thread id somehow.

Yes. Indirect TIMA accesses need to be discussed.

Check the pnv_xive_thread_indirect_set() routine in the PnvXive controller
and the associated changes in xive_tm_write() and xive_tm_read() of the
TIMA, which are below.

Thanks,

C.
>> +}
>> +
>> +static int pnv_xive_dt_xscom(PnvXScomInterface *dev, void *fdt,
>> +                             int xscom_offset)
>> +{
>> +    const char compat[] = "ibm,power9-xive-x";
>> +    char *name;
>> +    int offset;
>> +    uint32_t lpc_pcba = PNV9_XSCOM_XIVE_BASE;
>> +    uint32_t reg[] = {
>> +        cpu_to_be32(lpc_pcba),
>> +        cpu_to_be32(PNV9_XSCOM_XIVE_SIZE)
>> +    };
>> +
>> +    name = g_strdup_printf("xive@%x", lpc_pcba);
>> +    offset = fdt_add_subnode(fdt, xscom_offset, name);
>> +    _FDT(offset);
>> +    g_free(name);
>> +
>> +    _FDT((fdt_setprop(fdt, offset, "reg", reg, sizeof(reg))));
>> +    _FDT((fdt_setprop(fdt, offset, "compatible", compat,
>> +                      sizeof(compat))));
>> +    return 0;
>> +}
>> +
>> +static Property pnv_xive_properties[] = {
>> +    DEFINE_PROP_UINT64("ic-bar", PnvXive, ic_base, 0),
>> +    DEFINE_PROP_UINT64("vc-bar", PnvXive, vc_base, 0),
>> +    DEFINE_PROP_UINT64("pc-bar", PnvXive, pc_base, 0),
>> +    DEFINE_PROP_UINT64("tm-bar", PnvXive, tm_base, 0),
>> +    DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
>> +static void pnv_xive_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +    PnvXScomInterfaceClass *xdc = PNV_XSCOM_INTERFACE_CLASS(klass);
>> +    XiveRouterClass *xrc = XIVE_ROUTER_CLASS(klass);
>> +    XiveFabricClass *xfc = XIVE_FABRIC_CLASS(klass);
>> +
>> +    xdc->dt_xscom = pnv_xive_dt_xscom;
>> +
>> +    dc->desc = "PowerNV XIVE Interrupt Controller";
>> +    dc->realize = pnv_xive_realize;
>> +    dc->props = pnv_xive_properties;
>> +    dc->reset = pnv_xive_reset;
>> +
>> +    xrc->get_eas = pnv_xive_get_eas;
>> +    xrc->set_eas = pnv_xive_set_eas;
>> +    xrc->get_end = pnv_xive_get_end;
>> +    xrc->set_end = pnv_xive_set_end;
>> +    xrc->get_nvt = pnv_xive_get_nvt;
>> +    xrc->set_nvt = pnv_xive_set_nvt;
>> +
>> +    xfc->notify = pnv_xive_notify;
>> +};
>> +
>> +static const TypeInfo pnv_xive_info = {
>> +    .name = TYPE_PNV_XIVE,
>> +    .parent = TYPE_XIVE_ROUTER,
>> +    .instance_init = pnv_xive_init,
>> +    .instance_size = sizeof(PnvXive),
>> +    .class_init = pnv_xive_class_init,
>> +    .interfaces = (InterfaceInfo[]) {
>> +        { TYPE_PNV_XSCOM_INTERFACE },
>> +        { }
>> +    }
>> +};
>> +
>> +static void pnv_xive_register_types(void)
>> +{
>> +    type_register_static(&pnv_xive_info);
>> +}
>> +
>> +type_init(pnv_xive_register_types)
>> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
>> index c9aedecc8216..9925c90481ae 100644
>> --- a/hw/intc/xive.c
>> +++ b/hw/intc/xive.c
>> @@ -51,6 +51,8 @@ static uint8_t exception_mask(uint8_t ring)
>>      switch (ring) {
>>      case TM_QW1_OS:
>>          return TM_QW1_NSR_EO;
>> +    case TM_QW3_HV_PHYS:
>> +        return TM_QW3_NSR_HE;
>>      default:
>>          g_assert_not_reached();
>>      }
>> @@ -85,7 +87,17 @@ static void xive_tctx_notify(XiveTCTX *tctx, uint8_t ring)
>>      uint8_t *regs = &tctx->regs[ring];
>>  
>>      if (regs[TM_PIPR] < regs[TM_CPPR]) {
>> -        regs[TM_NSR] |= exception_mask(ring);
>> +        switch (ring) {
>> +        case TM_QW1_OS:
>> +            regs[TM_NSR] |= TM_QW1_NSR_EO;
>> +            break;
>> +        case TM_QW3_HV_PHYS:
>> +            regs[TM_NSR] |= SETFIELD(TM_QW3_NSR_HE, regs[TM_NSR],
>> +                                     TM_QW3_NSR_HE_PHYS);
>> +            break;
>> +        default:
>> +            g_assert_not_reached();
>> +        }
>>          qemu_irq_raise(tctx->output);
>>      }
>>  }
>> @@ -116,6 +128,38 @@ static void xive_tctx_set_cppr(XiveTCTX *tctx, uint8_t ring, uint8_t cppr)
>>  #define XIVE_TM_OS_PAGE         0x2
>>  #define XIVE_TM_USER_PAGE       0x3
>>  
>> +static void xive_tm_set_hv_cppr(XiveTCTX *tctx, hwaddr offset,
>> +                                uint64_t value, unsigned size)
>> +{
>> +    xive_tctx_set_cppr(tctx, TM_QW3_HV_PHYS, value & 0xff);
>> +}
>> +
>> +static uint64_t xive_tm_ack_hv_reg(XiveTCTX *tctx, hwaddr offset, unsigned size)
>> +{
>> +    return xive_tctx_accept(tctx, TM_QW3_HV_PHYS);
>> +}
>> +
>> +static uint64_t xive_tm_pull_pool_ctx(XiveTCTX *tctx, hwaddr offset,
>> +                                      unsigned size)
>> +{
>> +    uint64_t ret;
>> +
>> +    ret = tctx->regs[TM_QW2_HV_POOL + TM_WORD2] & TM_QW2W2_POOL_CAM;
>> +    tctx->regs[TM_QW2_HV_POOL + TM_WORD2] &= ~TM_QW2W2_POOL_CAM;
>> +    return ret;
>> +}
>> +
>> +static void xive_tm_vt_push(XiveTCTX *tctx, hwaddr offset,
>> +                            uint64_t value, unsigned size)
>> +{
>> +    tctx->regs[TM_QW3_HV_PHYS + TM_WORD2] = value & 0xff;
>> +}
>> +
>> +static uint64_t xive_tm_vt_poll(XiveTCTX *tctx, hwaddr offset, unsigned size)
>> +{
>> +    return tctx->regs[TM_QW3_HV_PHYS + TM_WORD2] & 0xff;
>> +}
>> +
>>  /*
>>   * Define an access map for each page of the TIMA that we will use in
>>   * the memory region ops to filter values when doing loads and stores
>> @@ -295,10 +339,16 @@ static const XiveTmOp xive_tm_operations[] = {
>>       * effects
>>       */
>>      { XIVE_TM_OS_PAGE, TM_QW1_OS + TM_CPPR, 1, xive_tm_set_os_cppr, NULL },
>> +    { XIVE_TM_HV_PAGE, TM_QW3_HV_PHYS + TM_CPPR, 1, xive_tm_set_hv_cppr, NULL },
>> +    { XIVE_TM_HV_PAGE, TM_QW3_HV_PHYS + TM_WORD2, 1, xive_tm_vt_push, NULL },
>> +    { XIVE_TM_HV_PAGE, TM_QW3_HV_PHYS + TM_WORD2, 1, NULL, xive_tm_vt_poll },
>>  
>>      /* MMIOs above 2K : special operations with side effects */
>>      { XIVE_TM_OS_PAGE, TM_SPC_ACK_OS_REG, 2, NULL, xive_tm_ack_os_reg },
>>      { XIVE_TM_OS_PAGE, TM_SPC_SET_OS_PENDING, 1, xive_tm_set_os_pending, NULL },
>> +    { XIVE_TM_HV_PAGE, TM_SPC_ACK_HV_REG, 2, NULL, xive_tm_ack_hv_reg },
>> +    { XIVE_TM_HV_PAGE, TM_SPC_PULL_POOL_CTX, 4, NULL, xive_tm_pull_pool_ctx },
>> +    { XIVE_TM_HV_PAGE, TM_SPC_PULL_POOL_CTX, 8, NULL, xive_tm_pull_pool_ctx },
>>  };
>>  
>>  static const XiveTmOp *xive_tm_find_op(hwaddr offset, unsigned size, bool write)
>> @@ -327,7 +377,8 @@ static const XiveTmOp *xive_tm_find_op(hwaddr offset, unsigned size, bool write)
>>  static void xive_tm_write(void *opaque, hwaddr offset,
>>                            uint64_t value, unsigned size)
>>  {
>> -    PowerPCCPU *cpu = POWERPC_CPU(current_cpu);
>> +    PowerPCCPU **cpuptr = opaque;
>> +    PowerPCCPU *cpu = *cpuptr ? *cpuptr : POWERPC_CPU(current_cpu);
>>      XiveTCTX *tctx = XIVE_TCTX(cpu->intc);
>>      const XiveTmOp *xto;
>>  
>> @@ -366,7 +417,8 @@ static void xive_tm_write(void *opaque, hwaddr offset,
>>  
>>  static uint64_t xive_tm_read(void *opaque, hwaddr offset, unsigned size)
>>  {
>> -    PowerPCCPU *cpu = POWERPC_CPU(current_cpu);
>> +    PowerPCCPU **cpuptr = opaque;
>> +    PowerPCCPU *cpu = *cpuptr ? *cpuptr : POWERPC_CPU(current_cpu);
>>      XiveTCTX *tctx = XIVE_TCTX(cpu->intc);
>>      const XiveTmOp *xto;
>>  
>> @@ -501,6 +553,9 @@ static void xive_tctx_base_reset(void *dev)
>>       */
>>      tctx->regs[TM_QW1_OS + TM_PIPR] =
>>          ipb_to_pipr(tctx->regs[TM_QW1_OS + TM_IPB]);
>> +    tctx->regs[TM_QW3_HV_PHYS + TM_PIPR] =
>> +        ipb_to_pipr(tctx->regs[TM_QW3_HV_PHYS + TM_IPB]);
>> +
>>  
>>      /*
>>       * QEMU sPAPR XIVE only. To let the controller model reset the OS
>> @@ -1513,7 +1568,7 @@ static void xive_router_end_notify(XiveRouter *xrtr, uint8_t end_blk,
>>      /* TODO: Auto EOI. */
>>  }
>>  
>> -static void xive_router_notify(XiveFabric *xf, uint32_t lisn)
>> +void xive_router_notify(XiveFabric *xf, uint32_t lisn)
>>  {
>>      XiveRouter *xrtr = XIVE_ROUTER(xf);
>>      XiveEAS eas;
>> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
>> index 66f2301b4ece..7b0bda652338 100644
>> --- a/hw/ppc/pnv.c
>> +++ b/hw/ppc/pnv.c
>> @@ -279,7 +279,10 @@ static void pnv_dt_chip(PnvChip *chip, void *fdt)
>>          pnv_dt_core(chip, pnv_core, fdt);
>>  
>>          /* Interrupt Control Presenters (ICP). One per core. */
>> -        pnv_dt_icp(chip, fdt, pnv_core->pir, CPU_CORE(pnv_core)->nr_threads);
>> +        if (!pnv_chip_is_power9(chip)) {
>> +            pnv_dt_icp(chip, fdt, pnv_core->pir,
>> +                       CPU_CORE(pnv_core)->nr_threads);
>> +        }
>>      }
>>  
>>      if (chip->ram_size) {
>> @@ -693,7 +696,15 @@ static uint32_t pnv_chip_core_pir_p9(PnvChip *chip, uint32_t core_id)
>>  static Object *pnv_chip_power9_intc_create(PnvChip *chip, Object *child,
>>                                             Error **errp)
>>  {
>> -    return NULL;
>> +    Pnv9Chip *chip9 = PNV9_CHIP(chip);
>> +
>> +    /*
>> +     * The core creates its interrupt presenter but the XIVE interrupt
>> +     * controller object is initialized afterwards. Hopefully, it's
>> +     * only used at runtime.
>> +     */
>> +    return xive_tctx_create(child, TYPE_XIVE_TCTX,
>> +                            XIVE_ROUTER(&chip9->xive), errp);
>>  }
>>  
>>  /* Allowed core identifiers on a POWER8 Processor Chip :
>> @@ -875,11 +886,19 @@ static void pnv_chip_power8nvl_class_init(ObjectClass *klass, void *data)
>>  
>>  static void pnv_chip_power9_instance_init(Object *obj)
>>  {
>> +    Pnv9Chip *chip9 = PNV9_CHIP(obj);
>> +
>> +    object_initialize(&chip9->xive, sizeof(chip9->xive), TYPE_PNV_XIVE);
>> +    object_property_add_child(obj, "xive", OBJECT(&chip9->xive), NULL);
>> +    object_property_add_const_link(OBJECT(&chip9->xive), "chip", obj,
>> +                                   &error_abort);
>>  }
>>  
>>  static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
>>  {
>>      PnvChipClass *pcc = PNV_CHIP_GET_CLASS(dev);
>> +    Pnv9Chip *chip9 = PNV9_CHIP(dev);
>> +    PnvChip *chip = PNV_CHIP(dev);
>>      Error *local_err = NULL;
>>  
>>      pcc->parent_realize(dev, &local_err);
>> @@ -887,6 +906,24 @@ static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
>>          error_propagate(errp, local_err);
>>          return;
>>      }
>> +
>> +    object_property_set_int(OBJECT(&chip9->xive), PNV9_XIVE_IC_BASE(chip),
>> +                            "ic-bar", &error_fatal);
>> +    object_property_set_int(OBJECT(&chip9->xive), PNV9_XIVE_VC_BASE(chip),
>> +                            "vc-bar", &error_fatal);
>> +    object_property_set_int(OBJECT(&chip9->xive), PNV9_XIVE_PC_BASE(chip),
>> +                            "pc-bar", &error_fatal);
>> +    object_property_set_int(OBJECT(&chip9->xive), PNV9_XIVE_TM_BASE(chip),
>> +                            "tm-bar", &error_fatal);
>> +    object_property_set_bool(OBJECT(&chip9->xive), true, "realized",
>> +                             &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +        return;
>> +    }
>> +    qdev_set_parent_bus(DEVICE(&chip9->xive), sysbus_get_default());
>> +    pnv_xscom_add_subregion(chip, PNV9_XSCOM_XIVE_BASE,
>> +                            &chip9->xive.xscom_regs);
>>  }
>>  
>>  static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
>> @@ -1087,12 +1124,23 @@ static void pnv_pic_print_info(InterruptStatsProvider *obj,
>>      CPU_FOREACH(cs) {
>>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>>  
>> -        icp_pic_print_info(ICP(cpu->intc), mon);
>> +        if (pnv_chip_is_power9(pnv->chips[0])) {
>> +            xive_tctx_pic_print_info(XIVE_TCTX(cpu->intc), mon);
>> +        } else {
>> +            icp_pic_print_info(ICP(cpu->intc), mon);
>> +        }
>>      }
>>  
>>      for (i = 0; i < pnv->num_chips; i++) {
>> -        Pnv8Chip *chip8 = PNV8_CHIP(pnv->chips[i]);
>> -        ics_pic_print_info(&chip8->psi.ics, mon);
>> +        PnvChip *chip = pnv->chips[i];
>> +
>> +        if (pnv_chip_is_power9(pnv->chips[i])) {
>> +            Pnv9Chip *chip9 = PNV9_CHIP(chip);
>> +            pnv_xive_pic_print_info(&chip9->xive, mon);
>> +        } else {
>> +            Pnv8Chip *chip8 = PNV8_CHIP(chip);
>> +            ics_pic_print_info(&chip8->psi.ics, mon);
>> +        }
>>      }
>>  }
>>  
>> diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
>> index dd4d69db2bdd..145bfaf44014 100644
>> --- a/hw/intc/Makefile.objs
>> +++ b/hw/intc/Makefile.objs
>> @@ -40,7 +40,7 @@ obj-$(CONFIG_XICS_KVM) += xics_kvm.o
>>  obj-$(CONFIG_XIVE) += xive.o
>>  obj-$(CONFIG_XIVE_SPAPR) += spapr_xive.o spapr_xive_hcall.o
>>  obj-$(CONFIG_XIVE_KVM) += spapr_xive_kvm.o
>> -obj-$(CONFIG_POWERNV) += xics_pnv.o
>> +obj-$(CONFIG_POWERNV) += xics_pnv.o pnv_xive.o
>>  obj-$(CONFIG_ALLWINNER_A10_PIC) += allwinner-a10-pic.o
>>  obj-$(CONFIG_S390_FLIC) += s390_flic.o
>>  obj-$(CONFIG_S390_FLIC_KVM) += s390_flic_kvm.o
> 