From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:58410) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XPZzR-0000q5-Sf for qemu-devel@nongnu.org; Thu, 04 Sep 2014 12:34:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XPZzK-0004SB-4f for qemu-devel@nongnu.org; Thu, 04 Sep 2014 12:34:29 -0400 Received: from e35.co.us.ibm.com ([32.97.110.153]:57461) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XPZzJ-0004Rw-T1 for qemu-devel@nongnu.org; Thu, 04 Sep 2014 12:34:22 -0400 Received: from /spool/local by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 4 Sep 2014 10:34:20 -0600 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable From: Michael Roth In-Reply-To: <20140904161215.32021.65613@loki> References: <1408407718-10835-1-git-send-email-mdroth@linux.vnet.ibm.com> <1408407718-10835-10-git-send-email-mdroth@linux.vnet.ibm.com> <20140903230331.32021.36244@loki> <20140904161215.32021.65613@loki> Message-ID: <20140904163414.32021.85667@loki> Date: Thu, 04 Sep 2014 11:34:14 -0500 Subject: Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Bharata B Rao Cc: aik@ozlabs.ru, "qemu-devel@nongnu.org" , Alexander Graf , ncmike@ncultra.org, "qemu-ppc@nongnu.org" , tyreld@linux.vnet.ibm.com, nfont@linux.vnet.ibm.com Quoting Michael Roth (2014-09-04 11:12:15) > Quoting Bharata B Rao (2014-09-04 10:08:20) > > On Thu, Sep 4, 2014 at 4:33 AM, Michael Roth wrote: > > >> > +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice = *dev) > > >> > +{ > > >> > + sPAPRPHBState *phb =3D SPAPR_PCI_HOST_BRIDGE(qdev); > > >> > + sPAPRDrcEntry *drc_entry, *drc_entry_slot; > > >> > + sPAPRConfigureConnectorState *ccs; > > >> > + int slot =3D PCI_SLOT(dev->devfn); > > >> > + int offset, ret; > > >> > + void *fdt_orig, *fdt; > > >> > + char nodename[512]; > > >> > + uint32_t encoded =3D ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_= PRESENT, > > >> > + INDICATOR_ENTITY_SENSE_MA= SK, > > >> > + INDICATOR_ENTITY_SENSE_SH= IFT); > > >> > + > > >> > > >> I am building on this infrastructure of yours and adding CPU hotplug > > >> support to sPAPR guests. > > >> > > >> So we start with dr state of UNUSABLE and change it to PRESENT like > > >> above when performing hotplug operation. But after this, in case of > > >> CPU hotplug, the CPU hotplug code in the kernel > > >> (arch/powerpc/platforms/pseries/dlpar.c:dlpar_acquire_drc()) does > > >> get-sensor-state rtas call and expects the dr state to be UNUSABLE. = Is > > >> the guest kernel right in expecting dr state to be in UNUSABLE state > > >> like this ? I have in fact disabled this check in the guest kernel to > > >> get a CPU hotplugged successfully, but wanted to understand the state > > >> changes and the expectations from the guest kernel correctly. > > > > > > According to PAPR+ 2.7 13.5.3.3, > > > > > > PRESENT (1): > > > > > > Returned for logical and physical DR entities when the DR connector= is > > > allocated to the OS and the DR entity is present. For physical DR e= ntities, > > > this indicates that the DR connector actually has a DR entity plugg= ed into > > > it. For DR connectors of physical DR entities, the DR connector mus= t be > > > allocated to the OS to return this value, otherwise a status of -3,= no such > > > sensor implemented, will be returned from the get-sensor-state RTAS= call. For > > > DR connectors of logical DR entities, the DR connector must be allo= cated to > > > the OS to return this value, otherwise a sensor value of 2 or 3 wil= l be > > > returned. > > > > > > UNUSABLE (2): > > > > > > Returned for logical DR entities when the DR entity is not currently > > > available to the OS, but may possibly be made available to the OS b= y calling > > > set-indicator with the allocation-state indicator, setting that ind= icator to > > > usable. > > > > > > So it seems 'PRESENT' is in fact the right value immediately after PCI > > > hotplug, but it doesn't seem clear from the documentation whether 'PR= ESENT' > > > or 'UNUSABLE' is more correct immediately after CPU hotplug. What does > > > seem clear as that UNUSABLE does not have any use in the case of PCI > > > devices: just PRESENT/EMPTY(0). > > > > > > But we actually 0-initialize the sensor field for DRCEntrys correspon= ding > > > to PCI devices, which corresponds to 'EMPTY' (0), so the handling see= ms > > > correct for PCI devices... > > = > > Thanks Michael for the above information on PRESENT and USABLE states. > > = > > > > > > And since we already initialize PHB sensors to UNUSABLE in the top-le= vel > > > DRC list, I'm not sure why adjacent CPU entries would be affected by = what > > > we do later for PCI devices? > > = > > Sorry if I wasn't clear enough in my previous mail. CPU hotplug isn't > > affected by what you do for PCI devices, but... > > = > > > It seems like you'd just need to do the > > > equivalent of what we do for PHBs during realize: > > = > > when I try to do the same state changes for CPU hotplug, things don't > > work as expected. > > = > > > > > > spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */); > > > > > > So I'm not sure where the need for guest kernel changes is coming fro= m for > > > CPU hotplug. > > = > > When the resource is hotplugged, you change the state from UNUSABLE to > > PRESENT in QEMU before signalling the guest (via check exception irq). > > But the same state change in CPU hotplug case isn't as per guest > > kernel's expectation. > > = > > > Do you have a snippet of what the initialize/hot_add hooks > > > like in your case? > > = > > I am talking about this piece of code in the the kernel in > > arch/powerpc/platforms/pseries/dlpar.c > > = > > int dlpar_acquire_drc(u32 drc_index) > > { > > int dr_status, rc; > > = > > rc =3D rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_stat= us, > > DR_ENTITY_SENSE, drc_index); > > if (rc || dr_status !=3D DR_ENTITY_UNUSABLE) > > return -1; > > ... > > } > > = > > I have circumvented this problem by not setting the state to PRESENT > > in my current hotplug patch. You can refer to the same in > > spapr_cpu_hotplug_add() routine that's part of my patch 14/15 > > (https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00757.html) > = > Yah, that's what I was getting at: at least just to get things working > for testing, just avoid the PRESENT bits in your hot_add_cpu hook rather > than patching the guest. Unfortunately the documentation isn't particular= ly > clear about which of these approaches is more correct as far as CPUs go. = But > looking at it again: > = > UNUSABLE (2): > = > Returned for logical DR entities when the DR entity is not currently > available to the OS, but may possibly be made available to the OS by c= alling > set-indicator with the allocation-state indicator, setting that indica= tor to > usable. > = > That 'usable' indicator setting is documented for set-indicator as (1), w= hich > happens to correspond to PRESENT (1). So my read would be that for 'physi= cal' > hotplug (like PCI), the firmware changes the indicator state to PRESENT/U= SABLE > immediately after hotplug, whereas with 'logical' hotplug (like PHB/CPU),= the > guest OS signals this transition to USABLE through set-indicator calls fo= r the > 9003 sensor/allocation state after hotplug (which also seems to match up = with > the kernel code). > = > This seems to correspond with the dlpar_acquire_drc() function, but I'm a > little confused why that's not also called in the PHB path...I think maybe > in that case it's handled by drmgr in userspace... will take another look > to confirm. Yah, here's the code from drmgr, same expectations: int = acquire_drc(uint32_t drc_index) { int rc; say(DEBUG, "Acquiring drc index 0x%x\n", drc_index); rc =3D dr_entity_sense(drc_index); if (rc !=3D STATE_UNUSABLE) { say(ERROR, "Entity sense failed for drc %x with %d\n%s\n", drc_index, rc, entity_sense_error(rc)); return -1; } say(DEBUG, "Setting allocation state to 'alloc usable'\n"); rc =3D rtas_set_indicator(ALLOCATION_STATE, drc_index, ALLOC_USABLE); if (rc) { say(ERROR, "Allocation failed for drc %x with %d\n%s\n", drc_index, rc, set_indicator_error(rc)); return -1; } say(DEBUG, "Setting indicator state to 'unisolate'\n"); rc =3D rtas_set_indicator(ISOLATION_STATE, drc_index, UNISOLATE); if (rc) { int ret; rc =3D -1; say(ERROR, "Unisolate failed for drc %x with %d\n%s\n", drc_index, rc, set_indicator_error(rc)); ret =3D rtas_set_indicator(ALLOCATION_STATE, drc_index, ALLOC_UNUSABLE); if (ret) { say(ERROR, "Failed recovery to unusable state after " "unisolate failure for drc %x with %d\n%s\n", drc_index, ret, set_indicator_error(ret)); } } return rc; } > = > > = > > Regards, > > Bharata.