* [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO
@ 2014-01-23 23:46 Christoffer Dall
2014-01-24 0:01 ` Peter Maydell
0 siblings, 1 reply; 9+ messages in thread
From: Christoffer Dall @ 2014-01-23 23:46 UTC (permalink / raw)
To: kvm
Cc: kvmarm, kvm-ppc, Christoffer Dall, Marc Zyngier, Peter Maydell,
Alexander Graf
The KVM API documentation is not clear about the semantics of the data
field on the mmio struct on the kvm_run struct.
This has become problematic when supporting ARM guests on big-endian
host systems with guests of both endianness types, because it is unclear
how the data should be exported to user space.
This should not break with existing implementations as all supported
existing implementations of known user space applications (QEMU and
kvmtools for virtio) only support default endianness of the
architectures on the host side.
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
Documentation/virtual/kvm/api.txt | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 366bf4b..fb7c7e4 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2565,6 +2565,11 @@ executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.
+The 'data' member byte order is host kernel native endianness, regardless of
+the endianness of the guest, and represents the the value as it would go on the
+bus in real hardware. The host kernel should always be able to do:
+<type> val = *((<type> *)mmio.data).
+
NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_DCR,
KVM_EXIT_PAPR and KVM_EXIT_EPR the corresponding
operations are complete (and guest state is consistent) only after userspace
--
1.8.5.2
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO
2014-01-23 23:46 [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO Christoffer Dall
@ 2014-01-24 0:01 ` Peter Maydell
2014-01-24 13:09 ` Paolo Bonzini
0 siblings, 1 reply; 9+ messages in thread
From: Peter Maydell @ 2014-01-24 0:01 UTC (permalink / raw)
To: Christoffer Dall
Cc: kvm-devel, kvmarm@lists.cs.columbia.edu, kvm-ppc, Marc Zyngier,
Alexander Graf
On 23 January 2014 23:46, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> The KVM API documentation is not clear about the semantics of the data
> field on the mmio struct on the kvm_run struct.
>
> This has become problematic when supporting ARM guests on big-endian
> host systems with guests of both endianness types, because it is unclear
> how the data should be exported to user space.
>
> This should not break with existing implementations as all supported
> existing implementations of known user space applications (QEMU and
> kvmtools for virtio) only support default endianness of the
> architectures on the host side.
>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Alexander Graf <agraf@suse.de>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
> Documentation/virtual/kvm/api.txt | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 366bf4b..fb7c7e4 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2565,6 +2565,11 @@ executed a memory-mapped I/O instruction which could not be satisfied
> by kvm. The 'data' member contains the written data if 'is_write' is
> true, and should be filled by application code otherwise.
>
> +The 'data' member byte order is host kernel native endianness, regardless of
> +the endianness of the guest, and represents the the value as it would go on the
> +bus in real hardware. The host kernel should always be able to do:
> +<type> val = *((<type> *)mmio.data).
I think this would be better phrased as "The host userspace should always",
since this documentation is supposed to be telling userspace what the
kernel's contract with it is, not the kernel keeping notes for itself on
its own implementation. (It also clarifies what the intention is for the
obscure and maybe-we'll-never-implement-this case of an LE host
kernel using a compatibility interface to run the host userspace (QEMU)
as a BE process which sees the same ABI a BE kernel provides,
without actually dragging that red herring explicitly into the documentation.)
On the general semantics I am entirely in agreement with the clarification.
thanks
-- PMM
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO
2014-01-24 0:01 ` Peter Maydell
@ 2014-01-24 13:09 ` Paolo Bonzini
2014-01-24 13:13 ` Alexander Graf
0 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2014-01-24 13:09 UTC (permalink / raw)
To: Peter Maydell, Christoffer Dall
Cc: kvm-devel, kvmarm@lists.cs.columbia.edu, kvm-ppc, Marc Zyngier,
Alexander Graf
Il 24/01/2014 01:01, Peter Maydell ha scritto:
>> >
>> > +The 'data' member byte order is host kernel native endianness, regardless of
>> > +the endianness of the guest, and represents the the value as it would go on the
>> > +bus in real hardware. The host kernel should always be able to do:
>> > +<type> val = *((<type> *)mmio.data).
> I think this would be better phrased as "The host userspace should always",
> since this documentation is supposed to be telling userspace what the
> kernel's contract with it is, not the kernel keeping notes for itself on
> its own implementation. (It also clarifies what the intention is for the
> obscure and maybe-we'll-never-implement-this case of an LE host
> kernel using a compatibility interface to run the host userspace (QEMU)
> as a BE process which sees the same ABI a BE kernel provides,
> without actually dragging that red herring explicitly into the documentation.)
I agree, and also the first line should mention userspace.
In PPC I think it's possible or even common to have BE host kernel and
LE host userspace (or perhaps vice versa is the common one).
Paolo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO
2014-01-24 13:09 ` Paolo Bonzini
@ 2014-01-24 13:13 ` Alexander Graf
2014-01-24 15:23 ` Victor Kamensky
2014-01-24 19:17 ` Victor Kamensky
0 siblings, 2 replies; 9+ messages in thread
From: Alexander Graf @ 2014-01-24 13:13 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Peter Maydell, Christoffer Dall, kvm-devel,
kvmarm@lists.cs.columbia.edu, kvm-ppc, Marc Zyngier
On 24.01.2014, at 14:09, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 24/01/2014 01:01, Peter Maydell ha scritto:
>>> >
>>> > +The 'data' member byte order is host kernel native endianness, regardless of
>>> > +the endianness of the guest, and represents the the value as it would go on the
>>> > +bus in real hardware. The host kernel should always be able to do:
>>> > +<type> val = *((<type> *)mmio.data).
>> I think this would be better phrased as "The host userspace should always",
>> since this documentation is supposed to be telling userspace what the
>> kernel's contract with it is, not the kernel keeping notes for itself on
>> its own implementation. (It also clarifies what the intention is for the
>> obscure and maybe-we'll-never-implement-this case of an LE host
>> kernel using a compatibility interface to run the host userspace (QEMU)
>> as a BE process which sees the same ABI a BE kernel provides,
>> without actually dragging that red herring explicitly into the documentation.)
>
> I agree, and also the first line should mention userspace.
>
> In PPC I think it's possible or even common to have BE host kernel and LE host userspace (or perhaps vice versa is the common one).
It was possible on 32bit, but I'm not sure anyone's actively using it :). The thing that was very common (not so much anymore for enterprise distros) is 32-bit user space with 64-bit kernels.
Alex
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO
2014-01-24 13:13 ` Alexander Graf
@ 2014-01-24 15:23 ` Victor Kamensky
2014-01-24 15:32 ` Paolo Bonzini
2014-01-24 16:34 ` Victor Kamensky
2014-01-24 19:17 ` Victor Kamensky
1 sibling, 2 replies; 9+ messages in thread
From: Victor Kamensky @ 2014-01-24 15:23 UTC (permalink / raw)
To: Alexander Graf, Christoffer Dall
Cc: Paolo Bonzini, kvm-devel, kvm-ppc, kvmarm@lists.cs.columbia.edu
On 24 January 2014 05:13, Alexander Graf <agraf@suse.de> wrote:
>
> On 24.01.2014, at 14:09, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>> Il 24/01/2014 01:01, Peter Maydell ha scritto:
>>>> >
>>>> > +The 'data' member byte order is host kernel native endianness, regardless of
>>>> > +the endianness of the guest, and represents the the value as it would go on the
>>>> > +bus in real hardware.
Also if you use ints on real bus as description, you may want to clarify
restrictions on mmio.len. Basically on 32 bit platform (i.e like V7
ARM) one cannot have mmio.len=8, because one cannot have 64bit
value on 32bit data bus. Without such clarification introduction of
text like "the value as it would go on the bus in real hardware" is
confusing for len=8 for emulated CPUs where real busses are
32bit.
If ldrd/strd would be emulated on ARMV7 one would need to use
mmio twice to pass required data in either direction using len=4 ..
Thanks,
Victor
> The host kernel should always be able to do:
>>>> > +<type> val = *((<type> *)mmio.data).
>>> I think this would be better phrased as "The host userspace should always",
>>> since this documentation is supposed to be telling userspace what the
>>> kernel's contract with it is, not the kernel keeping notes for itself on
>>> its own implementation. (It also clarifies what the intention is for the
>>> obscure and maybe-we'll-never-implement-this case of an LE host
>>> kernel using a compatibility interface to run the host userspace (QEMU)
>>> as a BE process which sees the same ABI a BE kernel provides,
>>> without actually dragging that red herring explicitly into the documentation.)
>>
>> I agree, and also the first line should mention userspace.
>>
>> In PPC I think it's possible or even common to have BE host kernel and LE host userspace (or perhaps vice versa is the common one).
>
> It was possible on 32bit, but I'm not sure anyone's actively using it :). The thing that was very common (not so much anymore for enterprise distros) is 32-bit user space with 64-bit kernels.
>
>
> Alex
>
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO
2014-01-24 15:23 ` Victor Kamensky
@ 2014-01-24 15:32 ` Paolo Bonzini
2014-01-24 16:19 ` Victor Kamensky
2014-01-24 16:34 ` Victor Kamensky
1 sibling, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2014-01-24 15:32 UTC (permalink / raw)
To: Victor Kamensky, Alexander Graf, Christoffer Dall
Cc: kvm-devel, kvm-ppc, kvmarm@lists.cs.columbia.edu
Il 24/01/2014 16:23, Victor Kamensky ha scritto:
> Also if you use ints on real bus as description, you may want to clarify
> restrictions on mmio.len. Basically on 32 bit platform (i.e like V7
> ARM) one cannot have mmio.len=8, because one cannot have 64bit
> value on 32bit data bus. Without such clarification introduction of
> text like "the value as it would go on the bus in real hardware" is
> confusing for len=8 for emulated CPUs where real busses are
> 32bit.
This is not necessarily true. On a 32-bit CPU you can have a 64-bit
memory bus. Even x86 32-bit CPUs can do 64-bit MMIO via MMX or SSE or
double-word compare-and-swap (CMPXCHG8B).
Paolo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO
2014-01-24 15:32 ` Paolo Bonzini
@ 2014-01-24 16:19 ` Victor Kamensky
0 siblings, 0 replies; 9+ messages in thread
From: Victor Kamensky @ 2014-01-24 16:19 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Alexander Graf, Christoffer Dall, kvm-devel, kvm-ppc,
kvmarm@lists.cs.columbia.edu
On 24 January 2014 07:32, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 24/01/2014 16:23, Victor Kamensky ha scritto:
>
>> Also if you use ints on real bus as description, you may want to clarify
>> restrictions on mmio.len. Basically on 32 bit platform (i.e like V7
>> ARM) one cannot have mmio.len=8, because one cannot have 64bit
>> value on 32bit data bus. Without such clarification introduction of
>> text like "the value as it would go on the bus in real hardware" is
>> confusing for len=8 for emulated CPUs where real busses are
>> 32bit.
>
>
> This is not necessarily true. On a 32-bit CPU you can have a 64-bit memory
> bus. Even x86 32-bit CPUs can do 64-bit MMIO via MMX or SSE or double-word
> compare-and-swap (CMPXCHG8B).
Sure, that was part of my point :). But now text says about "real
hardware" buses and in any given moment emulator specify particular
type of CPU emulated, and I am quite sure that we can find one
emulated ARMV7 CPU that would just have real 32bit data bus.
Note there was no such problem and nobody cares what real
data buses width are, with definition I argued for. That I think
that was original intent of '__u8 data[8]' use - just bytes array
description of how memory at given phys_addr looks before
(mmiois_write=0) or would look after (mmio.is_write = 1)
requested memory transaction (or several of them for that
matter) are emulated by KVM_EXIT_MMIO. When if comes
to devices memory read/write it is logical view of memory of
course. Such definition works with any "real" data buses sizes,
starting with byte.
But, ooh, well ... nobody understands such definition .. I
stand failed to explain it clear enough.
Thanks,
Victor
> Paolo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO
2014-01-24 15:23 ` Victor Kamensky
2014-01-24 15:32 ` Paolo Bonzini
@ 2014-01-24 16:34 ` Victor Kamensky
1 sibling, 0 replies; 9+ messages in thread
From: Victor Kamensky @ 2014-01-24 16:34 UTC (permalink / raw)
To: Alexander Graf, Christoffer Dall
Cc: Paolo Bonzini, kvm-devel, kvm-ppc, kvmarm@lists.cs.columbia.edu
On 24 January 2014 07:23, Victor Kamensky <victor.kamensky@linaro.org> wrote:
> On 24 January 2014 05:13, Alexander Graf <agraf@suse.de> wrote:
>>
>> On 24.01.2014, at 14:09, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>> Il 24/01/2014 01:01, Peter Maydell ha scritto:
>>>>> >
>>>>> > +The 'data' member byte order is host kernel native endianness, regardless of
>>>>> > +the endianness of the guest, and represents the the value as it would go on the
>>>>> > +bus in real hardware.
Other thing with mmio.len need clarification: please specify what
mmio.len values are allowed. It seems that it was implied that len
could be only power of 2 in range between 1 and 8. I would rather see
that it is spelled out. I think code on all sides of KVM_EXIT_MMIO
already make such assumption. Now when you talk about integers
values on "bus in real hardware" power of 2 sizes implied Also note
for memory pieces with sizes other than 2, 4, 8 endianness is not
defined.
>
> Also if you use ints on real bus as description, you may want to clarify
> restrictions on mmio.len. Basically on 32 bit platform (i.e like V7
> ARM) one cannot have mmio.len=8, because one cannot have 64bit
> value on 32bit data bus. Without such clarification introduction of
> text like "the value as it would go on the bus in real hardware" is
> confusing for len=8 for emulated CPUs where real busses are
> 32bit.
>
> If ldrd/strd would be emulated on ARMV7 one would need to use
> mmio twice to pass required data in either direction using len=4 ..
>
> Thanks,
> Victor
>
>> The host kernel should always be able to do:
>>>>> > +<type> val = *((<type> *)mmio.data).
would it be prudent to say that it is just integer <type> here.
Thanks,
Victor
>>>> I think this would be better phrased as "The host userspace should always",
>>>> since this documentation is supposed to be telling userspace what the
>>>> kernel's contract with it is, not the kernel keeping notes for itself on
>>>> its own implementation. (It also clarifies what the intention is for the
>>>> obscure and maybe-we'll-never-implement-this case of an LE host
>>>> kernel using a compatibility interface to run the host userspace (QEMU)
>>>> as a BE process which sees the same ABI a BE kernel provides,
>>>> without actually dragging that red herring explicitly into the documentation.)
>>>
>>> I agree, and also the first line should mention userspace.
>>>
>>> In PPC I think it's possible or even common to have BE host kernel and LE host userspace (or perhaps vice versa is the common one).
>>
>> It was possible on 32bit, but I'm not sure anyone's actively using it :). The thing that was very common (not so much anymore for enterprise distros) is 32-bit user space with 64-bit kernels.
>>
>>
>> Alex
>>
>>
>> _______________________________________________
>> kvmarm mailing list
>> kvmarm@lists.cs.columbia.edu
>> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO
2014-01-24 13:13 ` Alexander Graf
2014-01-24 15:23 ` Victor Kamensky
@ 2014-01-24 19:17 ` Victor Kamensky
1 sibling, 0 replies; 9+ messages in thread
From: Victor Kamensky @ 2014-01-24 19:17 UTC (permalink / raw)
To: Alexander Graf, Paolo Bonzini, Christoffer Dall, Peter Maydell
Cc: kvm-devel, kvm-ppc, kvmarm@lists.cs.columbia.edu
On 24 January 2014 05:13, Alexander Graf <agraf@suse.de> wrote:
>
> On 24.01.2014, at 14:09, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>> Il 24/01/2014 01:01, Peter Maydell ha scritto:
>>>> >
>>>> > +The 'data' member byte order is host kernel native endianness, regardless of
>>>> > +the endianness of the guest, and represents the the value as it would go on the
>>>> > +bus in real hardware. The host kernel should always be able to do:
>>>> > +<type> val = *((<type> *)mmio.data).
>>> I think this would be better phrased as "The host userspace should always",
>>> since this documentation is supposed to be telling userspace what the
>>> kernel's contract with it is, not the kernel keeping notes for itself on
>>> its own implementation. (It also clarifies what the intention is for the
>>> obscure and maybe-we'll-never-implement-this case of an LE host
>>> kernel using a compatibility interface to run the host userspace (QEMU)
>>> as a BE process which sees the same ABI a BE kernel provides,
>>> without actually dragging that red herring explicitly into the documentation.)
>>
>> I agree, and also the first line should mention userspace.
>>
>> In PPC I think it's possible or even common to have BE host kernel and LE host userspace (or perhaps vice versa is the common one).
>
> It was possible on 32bit, but I'm not sure anyone's actively using it :). The thing that was very common (not so much anymore for enterprise distros) is 32-bit user space with 64-bit kernels.
Paolo, Alex, good point about BE kernel / LE user-land mix!
How KVM kernel code that deals with KVM_MMIO_EXIT can find
out what is user process endianity that handles this
KVM_MMIO_EXIT? Do we have kernel function that can tell
that. "Hey, what is current user-land task endianity? :)".
And reverse case, how user-land code that wants to do
KVM_MMIO_EXIT can find out what is endianity of kernel?
Do we have such system call? "Hey, kernel tell me your
endianity :)".
Just a thought: should not we instead of trying to implicitly
setup endianity by some other side properties like emulator or
kernel endianity, let's just do it explicitly. Adding 'endianness'
field into current mmio structure is not an option, but maybe
there is outside mechanism like KVM features, special ioctl
number, through which one can explicitly either set of learn
endianity in mmio.data[] for given KVM session.
At least, if we don't want to consider mixed BE/LE kernel/user-land,
then document should clarify that BE/LE kernel/user-land mix is not
supported and assumption here is that they always coincide.
Thanks,
Victor
>
> Alex
>
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-01-24 19:17 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-01-23 23:46 [RFC PATCH] KVM: Specify byte order for KVM_EXIT_MMIO Christoffer Dall
2014-01-24 0:01 ` Peter Maydell
2014-01-24 13:09 ` Paolo Bonzini
2014-01-24 13:13 ` Alexander Graf
2014-01-24 15:23 ` Victor Kamensky
2014-01-24 15:32 ` Paolo Bonzini
2014-01-24 16:19 ` Victor Kamensky
2014-01-24 16:34 ` Victor Kamensky
2014-01-24 19:17 ` Victor Kamensky
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox