From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57532) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XoU5Z-0007lN-2r for qemu-devel@nongnu.org; Wed, 12 Nov 2014 04:19:54 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XoU5P-0001DX-UY for qemu-devel@nongnu.org; Wed, 12 Nov 2014 04:19:45 -0500 Received: from e06smtp16.uk.ibm.com ([195.75.94.112]:55506) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XoU5P-0001DQ-LQ for qemu-devel@nongnu.org; Wed, 12 Nov 2014 04:19:35 -0500 Received: from /spool/local by e06smtp16.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 12 Nov 2014 09:19:33 -0000 Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 112201B0806E for ; Wed, 12 Nov 2014 09:19:41 +0000 (GMT) Received: from d06av09.portsmouth.uk.ibm.com (d06av09.portsmouth.uk.ibm.com [9.149.37.250]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id sAC9JWqG7406050 for ; Wed, 12 Nov 2014 09:19:32 GMT Received: from d06av09.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av09.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id sAC9JVwI031420 for ; Wed, 12 Nov 2014 02:19:31 -0700 Date: Wed, 12 Nov 2014 10:19:31 +0100 From: Frank Blaschka Message-ID: <20141112091930.GA45236@tuxmaker.boeblingen.de.ibm.com> References: <1415629216-59652-3-git-send-email-blaschka@linux.vnet.ibm.com> <5460E025.60004@suse.de> <20141111121015.GA35338@tuxmaker.boeblingen.de.ibm.com> <5461FE04.2050400@suse.de> <20141111123911.GA16205@tuxmaker.boeblingen.de.ibm.com> <877C2703-5899-460C-BFF8-8856B6382F4C@suse.de> <20141111140829.GC16205@tuxmaker.boeblingen.de.ibm.com> <54622A28.6020507@suse.de> <20141112084944.GA16686@tuxmaker.boeblingen.de.ibm.com> <54632383.7080103@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <54632383.7080103@suse.de> Subject: Re: [Qemu-devel] [PATCH 2/3] s390: implement pci instructions List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alexander Graf Cc: "peter.maydell@linaro.org" , Frank Blaschka , "james.hogan@imgtec.com" , "mtosatti@redhat.com" , "qemu-devel@nongnu.org" , "borntraeger@de.ibm.com" , "cornelia.huck@de.ibm.com" , "pbonzini@redhat.com" , "rth@twiddle.net" On Wed, Nov 12, 2014 at 10:08:19AM +0100, Alexander Graf wrote: > > > On 12.11.14 09:49, Frank Blaschka wrote: > > On Tue, Nov 11, 2014 at 04:24:24PM +0100, Alexander Graf wrote: > >> > >> > >> On 11.11.14 15:08, Frank Blaschka wrote: > >>> On Tue, Nov 11, 2014 at 01:51:25PM +0100, Alexander Graf wrote: > >>>> > >>>> > >>>> > >>>>> Am 11.11.2014 um 13:39 schrieb Frank Blaschka : > >>>>> > >>>>>> On Tue, Nov 11, 2014 at 01:16:04PM +0100, Alexander Graf wrote: > >>>>>> > >>>>>> > >>>>>>> On 11.11.14 13:10, Frank Blaschka wrote: > >>>>>>>> On Mon, Nov 10, 2014 at 04:56:21PM +0100, Alexander Graf wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>>> On 10.11.14 15:20, Frank Blaschka wrote: > >>>>>>>>> From: Frank Blaschka > >>>>>>>>> > >>>>>>>>> This patch implements the s390 pci instructions in qemu. It allows > >>>>>>>>> to access and drive pci devices attached to the s390 pci bus. > >>>>>>>>> Because of platform constrains devices using IO BARs are not > >>>>>>>>> supported. Also a device has to support MSI/MSI-X to run on s390. > >>>>>>>>> > >>>>>>>>> Signed-off-by: Frank Blaschka > >>>>>>>>> --- > >>>>>>>>> target-s390x/Makefile.objs | 2 +- > >>>>>>>>> target-s390x/kvm.c | 52 ++++ > >>>>>>>>> target-s390x/pci_ic.c | 753 +++++++++++++++++++++++++++++++++++++++++++++ > >>>>>>>>> target-s390x/pci_ic.h | 335 ++++++++++++++++++++ > >>>>>>>>> 4 files changed, 1141 insertions(+), 1 deletion(-) > >>>>>>>>> create mode 100644 target-s390x/pci_ic.c > >>>>>>>>> create mode 100644 target-s390x/pci_ic.h > >>>>>>>>> > >>>>>> > >>>>>> [...] > >>>>>> > >>>>>>>>> +int kvm_pcilg_service_call(S390CPU *cpu, struct kvm_run *run) > >>>>>>>>> +{ > >>>>>>>>> + CPUS390XState *env = &cpu->env; > >>>>>>>>> + S390PCIBusDevice *pbdev; > >>>>>>>>> + uint8_t r1 = (run->s390_sieic.ipb & 0x00f00000) >> 20; > >>>>>>>>> + uint8_t r2 = (run->s390_sieic.ipb & 0x000f0000) >> 16; > >>>>>>>>> + PciLgStg *rp; > >>>>>>>>> + uint64_t offset; > >>>>>>>>> + uint64_t data; > >>>>>>>>> + uint8_t len; > >>>>>>>>> + > >>>>>>>>> + cpu_synchronize_state(CPU(cpu)); > >>>>>>>>> + > >>>>>>>>> + if (env->psw.mask & PSW_MASK_PSTATE) { > >>>>>>>>> + program_interrupt(env, PGM_PRIVILEGED, 4); > >>>>>>>>> + return 0; > >>>>>>>>> + } > >>>>>>>>> + > >>>>>>>>> + if (r2 & 0x1) { > >>>>>>>>> + program_interrupt(env, PGM_SPECIFICATION, 4); > >>>>>>>>> + return 0; > >>>>>>>>> + } > >>>>>>>>> + > >>>>>>>>> + rp = (PciLgStg *)&env->regs[r2]; > >>>>>>>>> + offset = env->regs[r2 + 1]; > >>>>>>>>> + > >>>>>>>>> + pbdev = s390_pci_find_dev_by_fh(rp->fh); > >>>>>>>>> + if (!pbdev) { > >>>>>>>>> + DPRINTF("pcilg no pci dev\n"); > >>>>>>>>> + setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); > >>>>>>>>> + return 0; > >>>>>>>>> + } > >>>>>>>>> + > >>>>>>>>> + len = rp->len & 0xF; > >>>>>>>>> + if (rp->pcias < 6) { > >>>>>>>>> + if ((8 - (offset & 0x7)) < len) { > >>>>>>>>> + program_interrupt(env, PGM_OPERAND, 4); > >>>>>>>>> + return 0; > >>>>>>>>> + } > >>>>>>>>> + MemoryRegion *mr = pbdev->pdev->io_regions[rp->pcias].memory; > >>>>>>>>> + io_mem_read(mr, offset, &data, len); > >>>>>>>>> + } else if (rp->pcias == 15) { > >>>>>>>>> + if ((4 - (offset & 0x3)) < len) { > >>>>>>>>> + program_interrupt(env, PGM_OPERAND, 4); > >>>>>>>>> + return 0; > >>>>>>>>> + } > >>>>>>>>> + data = pci_host_config_read_common( > >>>>>>>>> + pbdev->pdev, offset, pci_config_size(pbdev->pdev), len); > >>>>>>>>> + > >>>>>>>>> + switch (len) { > >>>>>>>>> + case 1: > >>>>>>>>> + break; > >>>>>>>>> + case 2: > >>>>>>>>> + data = cpu_to_le16(data); > >>>>>>>>> + break; > >>>>>>>>> + case 4: > >>>>>>>>> + data = cpu_to_le32(data); > >>>>>>>>> + break; > >>>>>>>>> + case 8: > >>>>>>>>> + data = cpu_to_le64(data); > >>>>>>>>> + break; > >>>>>>>> > >>>>>>>> Why? Also, this is wrong. cpu_to_le64 convert between host endianness > >>>>>>>> and LE. So if you're running this on an LE host, you won't swap the > >>>>>>>> value and get a broken result. > >>>>>>>> > >>>>>>>> If you know that the value is always swapped, use bswapxx(). > >>>>>>>> > >>>>>>> > >>>>>>> Actually the code is right and required for a big endian host :-) > >>>>>>> pcilg/pcistg provide access to the PCI config space which is defined > >>>>>>> as PCI byte order (little endian). Since pci_host_config_read_common does > >>>>>>> already a le to cpu conversion we have to convert back to PCI byte order. > >>>>>>> Doing an unconditional swap would be a bug on a little endian host. > >>>>>> > >>>>>> Why would it be a bug? The value you end up writing is contents of a > >>>>>> register and thus doesn't have endianness. So if QEMU was an LE process, > >>>>> > >>>>> No, the s390 guest executing pcilg instruction expects to receive config space data > >>>>> in PCI byte order. > >>>>> > >>>>>> the value of data would be identical as on a BE QEMU before your swab. > >>>>>> After the swab, it would be bswap'ed on BE, but not LE. So LE hosts break. > >>>>>> > >>>>> > >>>>> Again on BE endian host we do the swap because of pci_host_config_read_common does > >>>>> read the value and do a byte swap for that value, but we need PCI byte order not BE here. > >>>>> > >>>>> On LE host pci_host_config_read_common does not do a byte swap so we do not have to > >>>>> convert back to PCI byte order. > >>>> > >>>> We maintain the PCI config space always in LE byte order in memory, that's why there is a bwap in its read function. The return result of the read function however is always the same, regardless of LE or BE host. If I do a read of size 4, I will always get 0x1, not 0x01000000 returned. > >>>> > >>>> So now you need to convert that 0x1 into a 0x01000000 manually here because some architect thought that registers have endianness (which they don't). But you need to do it always, even on an LE host, because the pci config space return value is identical on LE and BE. > >>>> > >>> so you tell me pci_host_config_read_common does not end up in pci_default_read_config? > >>> > >>> uint32_t pci_default_read_config(PCIDevice *d, > >>> uint32_t address, int len) > >>> { > >>> uint32_t val = 0; > >>> > >>> memcpy(&val, d->config + address, len); > >>> return le32_to_cpu(val); > >>> } > >>> > >>> What did I miss? > >> > >> That's exactly where you end up in - and it's there to convert from the > >> PCI config space backing storage to a native number. > >> > >> Imagine you write 0x12345678 at offset 0. Because PCI config space is > >> defined to be LE, in the PCI config space memory this gets stored as > >> > >> 78 56 34 12 > >> > >> The reason we do the internal storage of the config space that way is > >> that it's (in some PCI implementations) legal to access with single byte > >> granularities. So you could do a pci_config_read(offset = 1) which > >> should return 0x56. > >> > >> However, that means we completely nullify any effect of host endianness > >> in the PCI config layer already. So if you do pci_config_write(offset = > >> 0, size = 4, value = 0x12345678), the contents of d->config will always > >> be identical, regardless of host endianness. The same holds true for > >> pci_config_read(offset = 0, size = 4). It will always return 0x12345678. > >> > > > > I understood this from the beginning and I completely agree to this. > > > >> In your code, you swab that value again. I assume there's a reason > >> you're swapping it and that it's the way the architecture expects it > > > > Yes, s390 pcilg architecture states: > > Data in the PCI configuration space are treated > > as being in little-endian byte ordering > > > >> (mind to point me to the respective spec so I can verify?). But if the > >> architecture expects it, then it expects it regardless of host > >> endianness. The contents of regs[r1] should always be 0x78563412, no > >> matter whether we're in an LE or a BE environment. > >> > >> Does that make sense now? > >> > > Absolutely lets make an example for qemu running on BE and LE > > > > byte order config space backing pci_default_read_config pcilg (with cpu_to_le) > > BE 0x78563412 0x12345678 0x78563412 > > LE 0x78563412 0x78563412 0x78563412 > > No, pci_default_read_config() always returns 0x12345678 because it > returns a register, not memory. > You mean implementation of pci_default_read_config is broken? If it should return a register it should not do "return le32_to_cpu(val);" > > Alex > > > > > So what is the problem with my code? Adding unconditional byte swap instead of > > cpu_to_le in pcilg would break architecture for pcilg if qemu is running on LE > > platform. > > > >> > >> Alex > >> > > >