From: christoffer.dall@linaro.org (Christoffer Dall)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH REPOST 5/5] ARM: kvm MMIO support BE host running LE code
Date: Mon, 20 Jan 2014 17:19:15 -0800 [thread overview]
Message-ID: <20140121011915.GM13432@cbox> (raw)
In-Reply-To: <CAA3XUr1=yeO=jJrWZJvrdBWbfes8+dHqkGkccjked=Racz3y_w@mail.gmail.com>
On Mon, Jan 06, 2014 at 05:59:03PM -0800, Victor Kamensky wrote:
> On 6 January 2014 14:56, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> > On Mon, Jan 06, 2014 at 10:31:42PM +0000, Peter Maydell wrote:
> >> On 6 January 2014 18:20, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> > No matter how data is stored in memory (BE, LE, or
> >> > even PDP endianness), CPU registers always have a consistent
> >> > representation. They are immune to CPU endianness change, and storing
> >> > to/reading from memory won't change the value, as long as you use the
> >> > same endianness for writing/reading.
> >>
> >> Ah, endianness. This always confuses me, but I hope the following
> >> is correct... (in all the following when I say BE I mean BE8, not BE32,
> >> since BE32 and virtualization never occur in the same CPU).
> >>
> >> Certainly registers don't have endianness, but the entire point
> >> of the CPSR.E bit is exactly that it changes the value as it is
> >> stored to / read from memory, isn't it? -- that's where and when the
> >> byte-lane flipping happens.
> >>
> >> Where this impacts the hypervisor is that instead of actually sending
> >> the data out to the bus via the byte-swapping h/w, we've trapped instead.
> >> The hypervisor reads the original data directly from the guest CPU
> >> registers, and so it's the hypervisor and userspace support code that
> >> between them have to emulate the equivalent of the byte lane
> >> swapping h/w. You could argue that it shouldn't be the kernel's
> >> job, but since the kernel has to do it for the devices it emulates
> >> internally, I'm not sure that makes much sense.
> >
> > As far as I understand, this is exactly what vcpu_data_guest_to_host and
> > vcpu_data_host_to_guest do; emulate the byte lane swapping.
> >
> > The problem is that it only works on a little-endian host with the
> > current code, because be16_to_cpu (for example), actually perform a
> > byteswap, which is what needs to be emulated. On a big-endian host, we
> > do nothing, so we end up giving a byteswapped value to the emulated
> > device.
>
> Yes, that was my point on the thread: vcpu_data_guest_to_host and
> vcpu_data_host_to_guest functions for any given host endianity should
> give opposite endian results depending on CPSR E bit value. And
> currently it is not happening in BE host case. It seems that Peter and
> you agree with that and I gave example in another email with
> dynamically switching E bit illustrating this problem for BE host.
>
> > I think a cleaner fix than this patch is to just change the
> > be16_to_cpu() to a __swab16() instead, which clearly indicates that
> > 'here is the byte lane swapping'.
>
> Yes, that may work, but it is a bit orthogonal issue.
Why? I don't think it is, I think it's addressing exactly the point at
hand.
> And I don't think
> it is better. For this to work one need to change canonical endianity on
> one of the sides around vcpu_data_guest_to_host and
> vcpu_data_host_to_guest functions.
You have to simply clearly define which format you want mmio.data to be
in. This is a user space interface across multiple architectures and
therefore something you have to consider carefully and you're limited in
choices to something that works with existing user space code.
>
> Changing it on side that faces hypervisor (code that handles guest spilled
> CPU register set) does not make sense at all - if we will keep guest CPU
> register set in memory in LE form and hypervisor runs in BE (BE host),
> code that spills registers would need to do constant byteswaps. Also any
> access by host kernel and hypervisor (all running in BE) would need to do
> byteswaps while working with guest saved registers.
>
> Changing canonical form of data on side that faces emulator and mmio
> part of kvm_run does not make sense either. kvm_run mmio.data field is
> bytes array, when it comes to host kernel from emulator, it already contains
> device memory in correct endian order that corresponds to endianity of
> emulated device. For example for LE device word read access, after call is
> emulated, mmio.data will contain mmio.data[0], mmio.data[1], mmio.data[2]
> mmio.data[3] values in LE order (mmio.data[3] is MSB). Now look at
> mmio_read_buf function introduced by Marc's 6d89d2d9 commit, this function
> will byte copy this mmio.data buffer into integer according to ongoing mmio
> access size. Note in BE host case such integer, in 'data' variable of
> kvm_handle_mmio_return function, will have byteswapped value. Now when it will
> be passed into vcpu_data_host_to_guest function, and it emulates read access
> of guest with E bit set, and if we follow your suggestion, it will be
> byteswapped.
> I.e 'data' integer will contain non byteswapped value of LE device. It will be
> further stored into some vcpu_reg register, still in native format (BE
> store), and
> further restored into guest CPU register, still non byteswapped (BE hypervisor).
> And that is not what BE client reading word of LE device expects - BE client
> knowing that it reads LE device with E bit set, it will issue additional rev
> instruction to get device memory as integer. If we really want to follow your
> suggestion, one may introduce compensatory byteswaps in mmio_read_buf
> and mmio_write_buf functions in case of BE host, rather then just do
> memcpy ... but I am not sure what it will buy us - in BE case it will swap data
> twice.
>
> Note in above description by "canonical" I mean some form of data regardless
> of current access CPSR E value. But it may differ depending on host endianess.
>
There's a lot of text to digest here, talking about a canonical form
here doesn't help; just define the layout of the destination byte array.
I also got completely lost in what you're referring to when you talk
about 'sides' here.
The thing we must decide is how the data is stored in
kvm_exit_mmio.data. See Peter's recent thread "KVM and
variable-endianness guest CPUs". Once we agree on this, the rest should
be easy (assuming we use the same structure for the data in the kernel's
internal kvm_exit_mmio declared on the stack in io_mem_abort()).
The format you suggest requires any consumer of this data to consider
the host endianness, which I don't think makes anything more clear (see
my comment on the vgic patch).
The in-kernel interface between the io_mem_abort() code and any
in-kernel emulated device must do exactly the same as the interface
between KVM and QEMU must do for KVM_EXIT_MMIO.
--
Christoffer
next prev parent reply other threads:[~2014-01-21 1:19 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-20 16:48 [PATCH REPOST 0/5] armv7 BE kvm support Victor Kamensky
2013-12-20 16:48 ` [PATCH REPOST 1/5] ARM: kvm: replace push and pop with stdmb and ldmia instrs to enable assembler.h inclusion Victor Kamensky
2014-01-21 1:18 ` Christoffer Dall
2014-01-21 9:58 ` Marc Zyngier
2014-01-22 6:41 ` Victor Kamensky
2014-01-22 10:22 ` Will Deacon
2013-12-20 16:48 ` [PATCH REPOST 2/5] ARM: fix KVM assembler files to work in BE case Victor Kamensky
2014-01-21 1:18 ` Christoffer Dall
2013-12-20 16:48 ` [PATCH REPOST 3/5] ARM: kvm one_reg coproc set and get BE fixes Victor Kamensky
2014-01-21 1:18 ` Christoffer Dall
2013-12-20 16:48 ` [PATCH REPOST 4/5] ARM: kvm vgic mmio should return data in BE format in BE case Victor Kamensky
2014-01-21 1:19 ` Christoffer Dall
2013-12-20 16:48 ` [PATCH REPOST 5/5] ARM: kvm MMIO support BE host running LE code Victor Kamensky
2014-01-06 12:37 ` Marc Zyngier
2014-01-06 17:44 ` Victor Kamensky
2014-01-06 18:20 ` Marc Zyngier
2014-01-06 19:55 ` Victor Kamensky
2014-01-06 22:31 ` Peter Maydell
2014-01-06 22:56 ` Christoffer Dall
2014-01-07 1:59 ` Victor Kamensky
2014-01-21 1:19 ` Christoffer Dall [this message]
2014-01-21 5:24 ` Victor Kamensky
2014-01-21 5:46 ` Anup Patel
2014-01-21 6:31 ` Christoffer Dall
2014-01-21 6:39 ` Victor Kamensky
2014-01-21 6:03 ` Christoffer Dall
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140121011915.GM13432@cbox \
--to=christoffer.dall@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox