From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42954)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <agraf@suse.de>) id 1XoDJ4-0000U7-Pp
	for qemu-devel@nongnu.org; Tue, 11 Nov 2014 10:24:42 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <agraf@suse.de>) id 1XoDIx-00055G-95
	for qemu-devel@nongnu.org; Tue, 11 Nov 2014 10:24:34 -0500
Received: from cantor2.suse.de ([195.135.220.15]:38853 helo=mx2.suse.de)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <agraf@suse.de>) id 1XoDIw-00054p-VU
	for qemu-devel@nongnu.org; Tue, 11 Nov 2014 10:24:27 -0500
Message-ID: <54622A28.6020507@suse.de>
Date: Tue, 11 Nov 2014 16:24:24 +0100
From: Alexander Graf <agraf@suse.de>
MIME-Version: 1.0
References: <1415629216-59652-1-git-send-email-blaschka@linux.vnet.ibm.com>
	<1415629216-59652-3-git-send-email-blaschka@linux.vnet.ibm.com>
	<5460E025.60004@suse.de>
	<20141111121015.GA35338@tuxmaker.boeblingen.de.ibm.com>
	<5461FE04.2050400@suse.de>
	<20141111123911.GA16205@tuxmaker.boeblingen.de.ibm.com>
	<877C2703-5899-460C-BFF8-8856B6382F4C@suse.de>
	<20141111140829.GC16205@tuxmaker.boeblingen.de.ibm.com>
In-Reply-To: <20141111140829.GC16205@tuxmaker.boeblingen.de.ibm.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH 2/3] s390: implement pci instructions
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Cc: "peter.maydell@linaro.org" <peter.maydell@linaro.org>, Frank Blaschka <frank.blaschka@de.ibm.com>, "james.hogan@imgtec.com" <james.hogan@imgtec.com>, "mtosatti@redhat.com" <mtosatti@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "borntraeger@de.ibm.com" <borntraeger@de.ibm.com>, "cornelia.huck@de.ibm.com" <cornelia.huck@de.ibm.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "rth@twiddle.net" <rth@twiddle.net>


On 11.11.14 15:08, Frank Blaschka wrote:
> On Tue, Nov 11, 2014 at 01:51:25PM +0100, Alexander Graf wrote:
>>
>>
>>
>>> Am 11.11.2014 um 13:39 schrieb Frank Blaschka <blaschka@linux.vnet.ib=
m.com>:
>>>
>>>> On Tue, Nov 11, 2014 at 01:16:04PM +0100, Alexander Graf wrote:
>>>>
>>>>
>>>>> On 11.11.14 13:10, Frank Blaschka wrote:
>>>>>> On Mon, Nov 10, 2014 at 04:56:21PM +0100, Alexander Graf wrote:
>>>>>>
>>>>>>
>>>>>>> On 10.11.14 15:20, Frank Blaschka wrote:
>>>>>>> From: Frank Blaschka <frank.blaschka@de.ibm.com>
>>>>>>>
>>>>>>> This patch implements the s390 pci instructions in qemu. It allow=
s
>>>>>>> to access and drive pci devices attached to the s390 pci bus.
>>>>>>> Because of platform constrains devices using IO BARs are not
>>>>>>> supported. Also a device has to support MSI/MSI-X to run on s390.
>>>>>>>
>>>>>>> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
>>>>>>> ---
>>>>>>> target-s390x/Makefile.objs |   2 +-
>>>>>>> target-s390x/kvm.c         |  52 ++++
>>>>>>> target-s390x/pci_ic.c      | 753 ++++++++++++++++++++++++++++++++=
+++++++++++++
>>>>>>> target-s390x/pci_ic.h      | 335 ++++++++++++++++++++
>>>>>>> 4 files changed, 1141 insertions(+), 1 deletion(-)
>>>>>>> create mode 100644 target-s390x/pci_ic.c
>>>>>>> create mode 100644 target-s390x/pci_ic.h
>>>>>>>
>>>>
>>>> [...]
>>>>
>>>>>>> +int kvm_pcilg_service_call(S390CPU *cpu, struct kvm_run *run)
>>>>>>> +{
>>>>>>> +    CPUS390XState *env =3D &cpu->env;
>>>>>>> +    S390PCIBusDevice *pbdev;
>>>>>>> +    uint8_t r1 =3D (run->s390_sieic.ipb & 0x00f00000) >> 20;
>>>>>>> +    uint8_t r2 =3D (run->s390_sieic.ipb & 0x000f0000) >> 16;
>>>>>>> +    PciLgStg *rp;
>>>>>>> +    uint64_t offset;
>>>>>>> +    uint64_t data;
>>>>>>> +    uint8_t len;
>>>>>>> +
>>>>>>> +    cpu_synchronize_state(CPU(cpu));
>>>>>>> +
>>>>>>> +    if (env->psw.mask & PSW_MASK_PSTATE) {
>>>>>>> +        program_interrupt(env, PGM_PRIVILEGED, 4);
>>>>>>> +        return 0;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (r2 & 0x1) {
>>>>>>> +        program_interrupt(env, PGM_SPECIFICATION, 4);
>>>>>>> +        return 0;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    rp =3D (PciLgStg *)&env->regs[r2];
>>>>>>> +    offset =3D env->regs[r2 + 1];
>>>>>>> +
>>>>>>> +    pbdev =3D s390_pci_find_dev_by_fh(rp->fh);
>>>>>>> +    if (!pbdev) {
>>>>>>> +        DPRINTF("pcilg no pci dev\n");
>>>>>>> +        setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
>>>>>>> +        return 0;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    len =3D rp->len & 0xF;
>>>>>>> +    if (rp->pcias < 6) {
>>>>>>> +        if ((8 - (offset & 0x7)) < len) {
>>>>>>> +            program_interrupt(env, PGM_OPERAND, 4);
>>>>>>> +            return 0;
>>>>>>> +        }
>>>>>>> +        MemoryRegion *mr =3D pbdev->pdev->io_regions[rp->pcias].=
memory;
>>>>>>> +        io_mem_read(mr, offset, &data, len);
>>>>>>> +    } else if (rp->pcias =3D=3D 15) {
>>>>>>> +        if ((4 - (offset & 0x3)) < len) {
>>>>>>> +            program_interrupt(env, PGM_OPERAND, 4);
>>>>>>> +            return 0;
>>>>>>> +        }
>>>>>>> +        data =3D  pci_host_config_read_common(
>>>>>>> +                   pbdev->pdev, offset, pci_config_size(pbdev->p=
dev), len);
>>>>>>> +
>>>>>>> +        switch (len) {
>>>>>>> +        case 1:
>>>>>>> +            break;
>>>>>>> +        case 2:
>>>>>>> +            data =3D cpu_to_le16(data);
>>>>>>> +            break;
>>>>>>> +        case 4:
>>>>>>> +            data =3D cpu_to_le32(data);
>>>>>>> +            break;
>>>>>>> +        case 8:
>>>>>>> +            data =3D cpu_to_le64(data);
>>>>>>> +            break;
>>>>>>
>>>>>> Why? Also, this is wrong. cpu_to_le64 convert between host endiann=
ess
>>>>>> and LE. So if you're running this on an LE host, you won't swap th=
e
>>>>>> value and get a broken result.
>>>>>>
>>>>>> If you know that the value is always swapped, use bswapxx().
>>>>>>
>>>>>
>>>>> Actually the code is right and required for a big endian host :-)
>>>>> pcilg/pcistg provide access to the PCI config space which is define=
d
>>>>> as PCI byte order (little endian). Since pci_host_config_read_commo=
n does
>>>>> already a le to cpu conversion we have to convert back to PCI byte =
order.
>>>>> Doing an unconditional swap would be a bug on a little endian host.
>>>>
>>>> Why would it be a bug? The value you end up writing is contents of a
>>>> register and thus doesn't have endianness. So if QEMU was an LE proc=
ess,
>>>
>>> No, the s390 guest executing pcilg instruction expects to receive con=
fig space data
>>> in PCI byte order.
>>>
>>>> the value of data would be identical as on a BE QEMU before your swa=
b.
>>>> After the swab, it would be bswap'ed on BE, but not LE. So LE hosts =
break.
>>>>
>>>
>>> Again on BE endian host we do the swap because of pci_host_config_rea=
d_common does
>>> read the value and do a byte swap for that value, but we need PCI byt=
e order not BE here.
>>>
>>> On LE host pci_host_config_read_common does not do a byte swap so we =
do not have to
>>> convert back to PCI byte order.
>>
>> We maintain the PCI config space always in LE byte order in memory, th=
at's why there is a bwap in its read function. The return result of the r=
ead function however is always the same, regardless of LE or BE host. If =
I do a read of size 4, I will always get 0x1, not 0x01000000 returned.
>>
>> So now you need to convert that 0x1 into a 0x01000000 manually here be=
cause some architect thought that registers have endianness (which they d=
on't). But you need to do it always, even on an LE host, because the pci =
config space return value is identical on LE and BE.
>>
> so you tell me pci_host_config_read_common does not end up in pci_defau=
lt_read_config?
>=20
> uint32_t pci_default_read_config(PCIDevice *d,
>                                  uint32_t address, int len)
> {
>     uint32_t val =3D 0;
>=20
>     memcpy(&val, d->config + address, len);
>     return le32_to_cpu(val);
> }
>=20
> What did I miss?

That's exactly where you end up in - and it's there to convert from the
PCI config space backing storage to a native number.

Imagine you write 0x12345678 at offset 0. Because PCI config space is
defined to be LE, in the PCI config space memory this gets stored as

78 56 34 12

The reason we do the internal storage of the config space that way is
that it's (in some PCI implementations) legal to access with single byte
granularities. So you could do a pci_config_read(offset =3D 1) which
should return 0x56.

However, that means we completely nullify any effect of host endianness
in the PCI config layer already. So if you do pci_config_write(offset =3D
0, size =3D 4, value =3D 0x12345678), the contents of d->config will alwa=
ys
be identical, regardless of host endianness. The same holds true for
pci_config_read(offset =3D 0, size =3D 4). It will always return 0x123456=
78.

In your code, you swab that value again. I assume there's a reason
you're swapping it and that it's the way the architecture expects it
(mind to point me to the respective spec so I can verify?). But if the
architecture expects it, then it expects it regardless of host
endianness. The contents of regs[r1] should always be 0x78563412, no
matter whether we're in an LE or a BE environment.

Does that make sense now?


Alex