From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43462) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W5Iae-0005mj-AT for qemu-devel@nongnu.org; Mon, 20 Jan 2014 12:24:57 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1W5IaQ-0006VA-Ju for qemu-devel@nongnu.org; Mon, 20 Jan 2014 12:24:48 -0500 Received: from e34.co.us.ibm.com ([32.97.110.152]:37011) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W5IaQ-0006Um-C0 for qemu-devel@nongnu.org; Mon, 20 Jan 2014 12:24:34 -0500 Received: from /spool/local by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 20 Jan 2014 10:24:32 -0700 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable From: Michael Roth In-Reply-To: <52DC90CC.4030301@ozlabs.ru> References: <1386282785-466-1-git-send-email-mdroth@linux.vnet.ibm.com> <1386282785-466-2-git-send-email-mdroth@linux.vnet.ibm.com> <52AE6C7E.7080300@ozlabs.ru> <52AE8792.4060205@ozlabs.ru> <20140116205102.32002.50059@loki> <52DC90CC.4030301@ozlabs.ru> Message-ID: <20140120172428.22088.20255@loki> Date: Mon, 20 Jan 2014 11:24:28 -0600 Subject: Re: [Qemu-devel] [PATCH v2 01/14] spapr: populate DRC entries for root dt node List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alexey Kardashevskiy , qemu-devel@nongnu.org Cc: agraf@suse.de, ncmike@ncultra.org, paulus@samba.org, tyreld@linux.vnet.ibm.com, nfont@linux.vnet.ibm.com, qemu-ppc@nongnu.org Quoting Alexey Kardashevskiy (2014-01-19 20:58:20) > On 01/17/2014 07:51 AM, Michael Roth wrote: > > Quoting Alexey Kardashevskiy (2013-12-15 22:54:42) > >> On 12/16/2013 01:59 PM, Alexey Kardashevskiy wrote: > >>> On 12/06/2013 09:32 AM, Michael Roth wrote: > >>>> From: Nathan Fontenot > >>>> > >>>> This add entries to the root OF node to advertise our PHBs as being > >>>> DR-capable in according with PAPR specification. > >>>> > >>>> Each PHB is given a name of PHB, advertised as a PHB type, > >>>> and associated with a power domain of -1 (indicating to guests that > >>>> power management is handled automatically by hardware). > >>>> > >>>> We currently allocate entries for up to 32 DR-capable PHBs, though > >>>> this limit can be increased later. > >>>> > >>>> DrcEntry objects to track the state of the DR-connector associated > >>>> with each PHB are stored in a 32-entry array, and each DrcEntry has > >>>> in turn have a dynamically-sized number of child DR-connectors, > >>>> which we will use later to track the state of DR-connectors > >>>> associated with a PHB's physical slots. > >>>> > >>>> Signed-off-by: Nathan Fontenot > >>>> Signed-off-by: Michael Roth > >>>> --- > >>>> hw/ppc/spapr.c | 132 +++++++++++++++++++++++++++++++++++++= +++++++++++ > >>>> include/hw/ppc/spapr.h | 33 ++++++++++++ > >>>> 2 files changed, 165 insertions(+) > >>>> > >>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > >>>> index 7e53a5f..ec3ba43 100644 > >>>> --- a/hw/ppc/spapr.c > >>>> +++ b/hw/ppc/spapr.c > >>>> @@ -81,6 +81,7 @@ > >>>> #define HTAB_SIZE(spapr) (1ULL << ((spapr)->htab_shift)) > >>>> = > >>>> sPAPREnvironment *spapr; > >>>> +DrcEntry drc_table[SPAPR_DRC_TABLE_SIZE]; > >>>> = > >>>> int spapr_allocate_irq(int hint, bool lsi) > >>>> { > >>>> @@ -276,6 +277,130 @@ static size_t create_page_sizes_prop(CPUPPCSta= te *env, uint32_t *prop, > >>>> return (p - prop) * sizeof(uint32_t); > >>>> } > >>>> = > >>>> +static void spapr_init_drc_table(void) > >>>> +{ > >>>> + int i; > >>>> + > >>>> + memset(drc_table, 0, sizeof(drc_table)); > >>>> + > >>>> + /* For now we only care about PHB entries */ > >>>> + for (i =3D 0; i < SPAPR_DRC_TABLE_SIZE; i++) { > >>>> + drc_table[i].drc_index =3D 0x2000001 + i; > >>>> + } > >>>> +} > >>>> + > >>>> +DrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state) > >>>> +{ > >>>> + DrcEntry *empty_drc =3D NULL; > >>>> + DrcEntry *found_drc =3D NULL; > >>>> + int i, phb_index; > >>>> + > >>>> + for (i =3D 0; i < SPAPR_DRC_TABLE_SIZE; i++) { > >>>> + if (drc_table[i].phb_buid =3D=3D 0) { > >>>> + empty_drc =3D &drc_table[i]; > >>>> + } > >>>> + > >>>> + if (drc_table[i].phb_buid =3D=3D buid) { > >>>> + found_drc =3D &drc_table[i]; > >>>> + break; > >>>> + } > >>>> + } > >>>> + > >>>> + if (found_drc) { > >>>> + return found_drc; > >>>> + } > >>>> + > >>>> + if (empty_drc) { > >>>> + empty_drc->phb_buid =3D buid; > >>>> + empty_drc->state =3D state; > >>>> + empty_drc->cc_state.fdt =3D NULL; > >>>> + empty_drc->cc_state.offset =3D 0; > >>>> + empty_drc->cc_state.depth =3D 0; > >>>> + empty_drc->cc_state.state =3D CC_STATE_IDLE; > >>>> + empty_drc->child_entries =3D > >>>> + g_malloc0(sizeof(DrcEntry) * SPAPR_DRC_PHB_SLOT_MAX); > >>>> + phb_index =3D buid - SPAPR_PCI_BASE_BUID; > >>>> + for (i =3D 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) { > >>>> + empty_drc->child_entries[i].drc_index =3D > >>>> + SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3); > >>>> + } > >>>> + return empty_drc; > >>>> + } > >>>> + > >>>> + return NULL; > >>>> +} > >>>> + > >>>> +static void spapr_create_drc_dt_entries(void *fdt) > >>>> +{ > >>>> + char char_buf[1024]; > >>>> + uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1]; > >>>> + uint32_t *entries; > >>>> + int offset, fdt_offset; > >>>> + int i, ret; > >>>> + > >>>> + fdt_offset =3D fdt_path_offset(fdt, "/"); > >>>> + > >>>> + /* ibm,drc-indexes */ > >>>> + memset(int_buf, 0, sizeof(int_buf)); > >>>> + int_buf[0] =3D SPAPR_DRC_TABLE_SIZE; > >>>> + > >>>> + for (i =3D 1; i <=3D SPAPR_DRC_TABLE_SIZE; i++) { > >>>> + int_buf[i] =3D drc_table[i-1].drc_index; > >>>> + } > >>>> + > >>>> + ret =3D fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf, > >>>> + sizeof(int_buf)); > >>>> + if (ret) { > >>>> + fprintf(stderr, "Couldn't finalize ibm,drc-indexes property= \n"); > >>>> + } > >>>> + > >>>> + /* ibm,drc-power-domains */ > >>>> + memset(int_buf, 0, sizeof(int_buf)); > >>>> + int_buf[0] =3D SPAPR_DRC_TABLE_SIZE; > >>>> + > >>>> + for (i =3D 1; i <=3D SPAPR_DRC_TABLE_SIZE; i++) { > >>>> + int_buf[i] =3D 0xffffffff; > >>>> + } > >>>> + > >>>> + ret =3D fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", i= nt_buf, > >>>> + sizeof(int_buf)); > >>>> + if (ret) { > >>>> + fprintf(stderr, "Couldn't finalize ibm,drc-power-domains pr= operty\n"); > >>>> + } > >>>> + > >>>> + /* ibm,drc-names */ > >>>> + memset(char_buf, 0, sizeof(char_buf)); > >>>> + entries =3D (uint32_t *)&char_buf[0]; > >>>> + *entries =3D SPAPR_DRC_TABLE_SIZE; > >>>> + offset =3D sizeof(*entries); > >>>> + > >>>> + for (i =3D 0; i < SPAPR_DRC_TABLE_SIZE; i++) { > >>>> + offset +=3D sprintf(char_buf + offset, "PHB %d", i + 1); > >>>> + char_buf[offset++] =3D '\0'; > >>>> + } > >>>> + > >>>> + ret =3D fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf,= offset); > >>>> + if (ret) { > >>>> + fprintf(stderr, "Couldn't finalize ibm,drc-names property\n= "); > >>>> + } > >>>> + > >>>> + /* ibm,drc-types */ > >>>> + memset(char_buf, 0, sizeof(char_buf)); > >>>> + entries =3D (uint32_t *)&char_buf[0]; > >>>> + *entries =3D SPAPR_DRC_TABLE_SIZE; > >>>> + offset =3D sizeof(*entries); > >>>> + > >>>> + for (i =3D 0; i < SPAPR_DRC_TABLE_SIZE; i++) { > >>>> + offset +=3D sprintf(char_buf + offset, "PHB"); > >>>> + char_buf[offset++] =3D '\0'; > >>>> + } > >>>> + > >>>> + ret =3D fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf,= offset); > >>>> + if (ret) { > >>>> + fprintf(stderr, "Couldn't finalize ibm,drc-types property\n= "); > >>>> + } > >>>> +} > >>>> + > >>>> #define _FDT(exp) \ > >>>> do { \ > >>>> int ret =3D (exp); = \ > >>>> @@ -307,6 +432,8 @@ static void *spapr_create_fdt_skel(hwaddr initrd= _base, > >>>> int i, smt =3D kvmppc_smt_threads(); > >>>> unsigned char vec5[] =3D {0x0, 0x0, 0x0, 0x0, 0x0, 0x80}; > >>>> = > >>>> + spapr_init_drc_table(); > >>>> + > >>>> fdt =3D g_malloc0(FDT_MAX_SIZE); > >>>> _FDT((fdt_create(fdt, FDT_MAX_SIZE))); > >>>> = > >>>> @@ -590,6 +717,7 @@ static void spapr_finalize_fdt(sPAPREnvironment = *spapr, > >>>> int ret; > >>>> void *fdt; > >>>> sPAPRPHBState *phb; > >>>> + DrcEntry *drc_entry; > >>>> = > >>>> fdt =3D g_malloc(FDT_MAX_SIZE); > >>>> = > >>>> @@ -609,6 +737,8 @@ static void spapr_finalize_fdt(sPAPREnvironment = *spapr, > >>>> } > >>>> = > >>>> QLIST_FOREACH(phb, &spapr->phbs, list) { > >>>> + drc_entry =3D spapr_add_phb_to_drc_table(phb->buid, 2 /* Un= usable */); > >>>> + g_assert(drc_entry); > >>>> ret =3D spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt); > >>>> } > >>>> = > >>>> @@ -633,6 +763,8 @@ static void spapr_finalize_fdt(sPAPREnvironment = *spapr, > >>>> spapr_populate_chosen_stdout(fdt, spapr->vio_bus); > >>>> } > >>>> = > >>>> + spapr_create_drc_dt_entries(fdt); > >>>> + > >>>> _FDT((fdt_pack(fdt))); > >>>> = > >>>> if (fdt_totalsize(fdt) > FDT_MAX_SIZE) { > >>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h > >>>> index b2f11e9..0f2e705 100644 > >>>> --- a/include/hw/ppc/spapr.h > >>>> +++ b/include/hw/ppc/spapr.h > >>>> @@ -299,6 +299,39 @@ typedef struct sPAPREnvironment { > >>>> #define KVMPPC_H_LOGICAL_MEMOP (KVMPPC_HCALL_BASE + 0x1) > >>>> #define KVMPPC_HCALL_MAX KVMPPC_H_LOGICAL_MEMOP > >>>> = > >>>> +/* For dlparable/hotpluggable slots */ > >>>> +#define SPAPR_DRC_TABLE_SIZE 32 > >>>> +#define SPAPR_DRC_PHB_SLOT_MAX 32 > >>>> +#define SPAPR_DRC_DEV_ID_BASE 0x40000000 > >>>> + > >>>> +typedef struct ConfigureConnectorState { > >>>> + void *fdt; > >>>> + int offset_start; > >>>> + int offset; > >>>> + int depth; > >>>> + PCIDevice *dev; > >>>> + enum { > >>>> + CC_STATE_IDLE =3D 0, > >>>> + CC_STATE_PENDING =3D 1, > >>>> + CC_STATE_ACTIVE, > >>>> + } state; > >>>> +} ConfigureConnectorState; > >>>> + > >>>> +typedef struct DrcEntry DrcEntry; > >>>> + > >>>> +struct DrcEntry { > >>>> + uint32_t drc_index; > >>>> + uint64_t phb_buid; > >>>> + void *fdt; > >>>> + int fdt_offset; > >>>> + uint32_t state; > >>>> + ConfigureConnectorState cc_state; > >>>> + DrcEntry *child_entries; > >>>> +}; > >>>> + > >>>> +extern DrcEntry drc_table[SPAPR_DRC_TABLE_SIZE]; > >>>> +DrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state); > >>>> + > >>>> extern sPAPREnvironment *spapr; > >>> > >>> So far we were trying to keep everything sPAPR-related in sPAPREnviro= nment. > >>> Is @drc_table really that special? > >> > >> > >> One more note - we are trying to add a "spapr" or "sPAPR" prefix to all > >> global types defines in headers (such as sPAPRPHBState, spapr_pci_lsi, > >> VIOsPAPRBus, sPAPREnvironment), it would be nice to have "spapr" in so= me > >> form in these new types too. > >> > >> Or we could move the whole patch (except spapr_create_drc_dt_entries()= ) to > >> hw/ppc/spapr_pci.c (and keep the original names) as it seems to be the= only > >> user of the whole DrcEntry and ConfigureConnectorState thing. > >> And put a pointer to drc_table[] into @spapr (or make it static?) > > = > > That would work, but I think we'd need to move spapr_create_drc_dt_entr= ies() > > as well, or the bits that rely on DrcEntry at least. Though I worry > > about scoping DrcEntry to spapr_pci.c at this early stage, as DR-capable > > components other than PCI may come to rely on state that's captured by = the > > DrcEntry nodes, such as boot-time FDT generation and run-time management > > (via ibm,configure-connector) of CPUs and memory. > > = > > Assuming that seems like a reasonable expectation, I think I'd prefer t= he > > first option of using spapr-specific prefixes for global types and movi= ng > > drc_table into sPAPREnvironment > = > = > I did not realize DRC is not just for PCI. How hard would it be to add hot > plug support for a whole PHB? The current QEMU trend is to make QEMU > monitor's "device_add" equal to the command line's "-device" which is not > (yet) true for PHB but could be. Thanks. > = Would need to look at it a bit more closely to say for certain, but after discussing it a bit Tyrel/Mike, I think the main considerations would be: 1) PHB hotplug/unplug would need to signal a different event type in it's check-exception/epow message, we have stubs in place for a PHB event typ= e, so that's mostly a matter of adding special-casing in the hotplug callba= ck for spapr-pci-host-bridge devices 2) The required properties for the OF node corresponding PHB will be differ= ent. Currently these are generated as part of the hotplug callback, and attac= hed to the corresponding ConfigureConnectorState node to be fed to the guest via subsequent ibm,configure-connector RTAS calls, so we'd just hook the PHB's OF node generation code in there as. 3) The sysctl/kernel interface for handling PHB hotplug would be different, we'd be relying on the rpadlar kernel module (/sys/bus/pci/slots/control/add_slot) rather than rpaphp (/sys/bus/pci/slots//power) or the PCI rescan fallback. This is mostly a matter of modifying the handling in the guest tools, na= mely in rtas_errd, to handle the event accordingly. We also haven't done anything extensive using rpadlpar operations within qe= mu guests, so there may be various odds/ends and possibly kernel changes neede= d to get that working properly (as was the case for rpaphp, though thanks to the= PCI rescan workaround a new kernel isn't required for existing guests... a simi= lar fallback likely won't be available for rpadlpar) But from a high-level view at least it seems fairly straight-forward. I'll = see if we can get a prototype working. > = > = > >> The only remaining user of DrcEntry is spapr_hotplug_req_event() but t= his > >> can be easily fixed by small helper like this: > >> > >> int spapr_phb_slot_to_drc_index(uint64_t buid, int slot) > >> { > >> DrcEntry *drc_entry =3D spapr_phb_to_drc_entry(phb->buid); > >> if (!drc_entry) { > >> drc_entry =3D spapr_add_phb_to_drc_table(phb->buid, 2); > >> } > >> return drc_entry->child_entries[slot].drc_index; > >> } > = > = > = > = > -- = > Alexey