From: Alexander Graf <agraf@suse.de>
To: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Cc: "peter.maydell@linaro.org" <peter.maydell@linaro.org>,
Frank Blaschka <frank.blaschka@de.ibm.com>,
"james.hogan@imgtec.com" <james.hogan@imgtec.com>,
"mtosatti@redhat.com" <mtosatti@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"borntraeger@de.ibm.com" <borntraeger@de.ibm.com>,
"cornelia.huck@de.ibm.com" <cornelia.huck@de.ibm.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"rth@twiddle.net" <rth@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH 2/3] s390: implement pci instructions
Date: Wed, 12 Nov 2014 10:22:48 +0100 [thread overview]
Message-ID: <546326E8.1000909@suse.de> (raw)
In-Reply-To: <20141112091930.GA45236@tuxmaker.boeblingen.de.ibm.com>
On 12.11.14 10:19, Frank Blaschka wrote:
> On Wed, Nov 12, 2014 at 10:08:19AM +0100, Alexander Graf wrote:
>>
>>
>> On 12.11.14 09:49, Frank Blaschka wrote:
>>> On Tue, Nov 11, 2014 at 04:24:24PM +0100, Alexander Graf wrote:
>>>>
>>>>
>>>> On 11.11.14 15:08, Frank Blaschka wrote:
>>>>> On Tue, Nov 11, 2014 at 01:51:25PM +0100, Alexander Graf wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Am 11.11.2014 um 13:39 schrieb Frank Blaschka <blaschka@linux.vnet.ibm.com>:
>>>>>>>
>>>>>>>> On Tue, Nov 11, 2014 at 01:16:04PM +0100, Alexander Graf wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 11.11.14 13:10, Frank Blaschka wrote:
>>>>>>>>>> On Mon, Nov 10, 2014 at 04:56:21PM +0100, Alexander Graf wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 10.11.14 15:20, Frank Blaschka wrote:
>>>>>>>>>>> From: Frank Blaschka <frank.blaschka@de.ibm.com>
>>>>>>>>>>>
>>>>>>>>>>> This patch implements the s390 pci instructions in qemu. It allows
>>>>>>>>>>> to access and drive pci devices attached to the s390 pci bus.
>>>>>>>>>>> Because of platform constrains devices using IO BARs are not
>>>>>>>>>>> supported. Also a device has to support MSI/MSI-X to run on s390.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
>>>>>>>>>>> ---
>>>>>>>>>>> target-s390x/Makefile.objs | 2 +-
>>>>>>>>>>> target-s390x/kvm.c | 52 ++++
>>>>>>>>>>> target-s390x/pci_ic.c | 753 +++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>>> target-s390x/pci_ic.h | 335 ++++++++++++++++++++
>>>>>>>>>>> 4 files changed, 1141 insertions(+), 1 deletion(-)
>>>>>>>>>>> create mode 100644 target-s390x/pci_ic.c
>>>>>>>>>>> create mode 100644 target-s390x/pci_ic.h
>>>>>>>>>>>
>>>>>>>>
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>>>> +int kvm_pcilg_service_call(S390CPU *cpu, struct kvm_run *run)
>>>>>>>>>>> +{
>>>>>>>>>>> + CPUS390XState *env = &cpu->env;
>>>>>>>>>>> + S390PCIBusDevice *pbdev;
>>>>>>>>>>> + uint8_t r1 = (run->s390_sieic.ipb & 0x00f00000) >> 20;
>>>>>>>>>>> + uint8_t r2 = (run->s390_sieic.ipb & 0x000f0000) >> 16;
>>>>>>>>>>> + PciLgStg *rp;
>>>>>>>>>>> + uint64_t offset;
>>>>>>>>>>> + uint64_t data;
>>>>>>>>>>> + uint8_t len;
>>>>>>>>>>> +
>>>>>>>>>>> + cpu_synchronize_state(CPU(cpu));
>>>>>>>>>>> +
>>>>>>>>>>> + if (env->psw.mask & PSW_MASK_PSTATE) {
>>>>>>>>>>> + program_interrupt(env, PGM_PRIVILEGED, 4);
>>>>>>>>>>> + return 0;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + if (r2 & 0x1) {
>>>>>>>>>>> + program_interrupt(env, PGM_SPECIFICATION, 4);
>>>>>>>>>>> + return 0;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + rp = (PciLgStg *)&env->regs[r2];
>>>>>>>>>>> + offset = env->regs[r2 + 1];
>>>>>>>>>>> +
>>>>>>>>>>> + pbdev = s390_pci_find_dev_by_fh(rp->fh);
>>>>>>>>>>> + if (!pbdev) {
>>>>>>>>>>> + DPRINTF("pcilg no pci dev\n");
>>>>>>>>>>> + setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
>>>>>>>>>>> + return 0;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + len = rp->len & 0xF;
>>>>>>>>>>> + if (rp->pcias < 6) {
>>>>>>>>>>> + if ((8 - (offset & 0x7)) < len) {
>>>>>>>>>>> + program_interrupt(env, PGM_OPERAND, 4);
>>>>>>>>>>> + return 0;
>>>>>>>>>>> + }
>>>>>>>>>>> + MemoryRegion *mr = pbdev->pdev->io_regions[rp->pcias].memory;
>>>>>>>>>>> + io_mem_read(mr, offset, &data, len);
>>>>>>>>>>> + } else if (rp->pcias == 15) {
>>>>>>>>>>> + if ((4 - (offset & 0x3)) < len) {
>>>>>>>>>>> + program_interrupt(env, PGM_OPERAND, 4);
>>>>>>>>>>> + return 0;
>>>>>>>>>>> + }
>>>>>>>>>>> + data = pci_host_config_read_common(
>>>>>>>>>>> + pbdev->pdev, offset, pci_config_size(pbdev->pdev), len);
>>>>>>>>>>> +
>>>>>>>>>>> + switch (len) {
>>>>>>>>>>> + case 1:
>>>>>>>>>>> + break;
>>>>>>>>>>> + case 2:
>>>>>>>>>>> + data = cpu_to_le16(data);
>>>>>>>>>>> + break;
>>>>>>>>>>> + case 4:
>>>>>>>>>>> + data = cpu_to_le32(data);
>>>>>>>>>>> + break;
>>>>>>>>>>> + case 8:
>>>>>>>>>>> + data = cpu_to_le64(data);
>>>>>>>>>>> + break;
>>>>>>>>>>
>>>>>>>>>> Why? Also, this is wrong. cpu_to_le64 convert between host endianness
>>>>>>>>>> and LE. So if you're running this on an LE host, you won't swap the
>>>>>>>>>> value and get a broken result.
>>>>>>>>>>
>>>>>>>>>> If you know that the value is always swapped, use bswapxx().
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Actually the code is right and required for a big endian host :-)
>>>>>>>>> pcilg/pcistg provide access to the PCI config space which is defined
>>>>>>>>> as PCI byte order (little endian). Since pci_host_config_read_common does
>>>>>>>>> already a le to cpu conversion we have to convert back to PCI byte order.
>>>>>>>>> Doing an unconditional swap would be a bug on a little endian host.
>>>>>>>>
>>>>>>>> Why would it be a bug? The value you end up writing is contents of a
>>>>>>>> register and thus doesn't have endianness. So if QEMU was an LE process,
>>>>>>>
>>>>>>> No, the s390 guest executing pcilg instruction expects to receive config space data
>>>>>>> in PCI byte order.
>>>>>>>
>>>>>>>> the value of data would be identical as on a BE QEMU before your swab.
>>>>>>>> After the swab, it would be bswap'ed on BE, but not LE. So LE hosts break.
>>>>>>>>
>>>>>>>
>>>>>>> Again on BE endian host we do the swap because of pci_host_config_read_common does
>>>>>>> read the value and do a byte swap for that value, but we need PCI byte order not BE here.
>>>>>>>
>>>>>>> On LE host pci_host_config_read_common does not do a byte swap so we do not have to
>>>>>>> convert back to PCI byte order.
>>>>>>
>>>>>> We maintain the PCI config space always in LE byte order in memory, that's why there is a bwap in its read function. The return result of the read function however is always the same, regardless of LE or BE host. If I do a read of size 4, I will always get 0x1, not 0x01000000 returned.
>>>>>>
>>>>>> So now you need to convert that 0x1 into a 0x01000000 manually here because some architect thought that registers have endianness (which they don't). But you need to do it always, even on an LE host, because the pci config space return value is identical on LE and BE.
>>>>>>
>>>>> so you tell me pci_host_config_read_common does not end up in pci_default_read_config?
>>>>>
>>>>> uint32_t pci_default_read_config(PCIDevice *d,
>>>>> uint32_t address, int len)
>>>>> {
>>>>> uint32_t val = 0;
>>>>>
>>>>> memcpy(&val, d->config + address, len);
>>>>> return le32_to_cpu(val);
>>>>> }
>>>>>
>>>>> What did I miss?
>>>>
>>>> That's exactly where you end up in - and it's there to convert from the
>>>> PCI config space backing storage to a native number.
>>>>
>>>> Imagine you write 0x12345678 at offset 0. Because PCI config space is
>>>> defined to be LE, in the PCI config space memory this gets stored as
>>>>
>>>> 78 56 34 12
>>>>
>>>> The reason we do the internal storage of the config space that way is
>>>> that it's (in some PCI implementations) legal to access with single byte
>>>> granularities. So you could do a pci_config_read(offset = 1) which
>>>> should return 0x56.
>>>>
>>>> However, that means we completely nullify any effect of host endianness
>>>> in the PCI config layer already. So if you do pci_config_write(offset =
>>>> 0, size = 4, value = 0x12345678), the contents of d->config will always
>>>> be identical, regardless of host endianness. The same holds true for
>>>> pci_config_read(offset = 0, size = 4). It will always return 0x12345678.
>>>>
>>>
>>> I understood this from the beginning and I completely agree to this.
>>>
>>>> In your code, you swab that value again. I assume there's a reason
>>>> you're swapping it and that it's the way the architecture expects it
>>>
>>> Yes, s390 pcilg architecture states:
>>> Data in the PCI configuration space are treated
>>> as being in little-endian byte ordering
>>>
>>>> (mind to point me to the respective spec so I can verify?). But if the
>>>> architecture expects it, then it expects it regardless of host
>>>> endianness. The contents of regs[r1] should always be 0x78563412, no
>>>> matter whether we're in an LE or a BE environment.
>>>>
>>>> Does that make sense now?
>>>>
>>> Absolutely lets make an example for qemu running on BE and LE
>>>
>>> byte order config space backing pci_default_read_config pcilg (with cpu_to_le)
>>> BE 0x78563412 0x12345678 0x78563412
>>> LE 0x78563412 0x78563412 0x78563412
>>
>> No, pci_default_read_config() always returns 0x12345678 because it
>> returns a register, not memory.
>>
>
> You mean implementation of pci_default_read_config is broken?
> If it should return a register it should not do "return le32_to_cpu(val);"
It has to, to convert from memory (after memcpy) to an actual register
value. Look at the value list in Paolo's email - I really have no idea
how to explain it any better.
Alex
next prev parent reply other threads:[~2014-11-12 9:23 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-10 14:20 [Qemu-devel] [PATCH 0/3] add PCI support for the s390 platform Frank Blaschka
2014-11-10 14:20 ` [Qemu-devel] [PATCH 1/3] s390: Add PCI bus support Frank Blaschka
2014-11-10 15:14 ` Alexander Graf
2014-11-18 12:50 ` Frank Blaschka
2014-11-18 17:00 ` Alexander Graf
2014-11-25 10:11 ` Frank Blaschka
2014-11-25 12:14 ` Alexander Graf
2014-11-25 12:43 ` Frank Blaschka
2014-11-10 14:20 ` [Qemu-devel] [PATCH 2/3] s390: implement pci instructions Frank Blaschka
2014-11-10 15:56 ` Alexander Graf
2014-11-11 12:10 ` Frank Blaschka
2014-11-11 12:16 ` Alexander Graf
2014-11-11 12:39 ` Frank Blaschka
2014-11-11 12:51 ` Alexander Graf
2014-11-11 14:08 ` Frank Blaschka
2014-11-11 15:24 ` Alexander Graf
2014-11-12 8:49 ` Frank Blaschka
2014-11-12 9:08 ` Alexander Graf
2014-11-12 9:11 ` Paolo Bonzini
2014-11-12 9:13 ` Alexander Graf
2014-11-12 9:19 ` Frank Blaschka
2014-11-12 9:22 ` Alexander Graf [this message]
2014-11-12 9:36 ` Paolo Bonzini
2014-11-12 14:34 ` Frank Blaschka
2014-11-11 12:17 ` Peter Maydell
2014-11-11 12:40 ` Frank Blaschka
2014-11-10 14:20 ` [Qemu-devel] [PATCH 3/3] kvm: extend kvm_irqchip_add_msi_route to work on s390 Frank Blaschka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=546326E8.1000909@suse.de \
--to=agraf@suse.de \
--cc=blaschka@linux.vnet.ibm.com \
--cc=borntraeger@de.ibm.com \
--cc=cornelia.huck@de.ibm.com \
--cc=frank.blaschka@de.ibm.com \
--cc=james.hogan@imgtec.com \
--cc=mtosatti@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=rth@twiddle.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).