All of lore.kernel.org
 help / color / mirror / Atom feed
From: christoffer.dall@linaro.org (Christoffer Dall)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH REPOST 5/5] ARM: kvm MMIO support BE host running LE code
Date: Mon, 20 Jan 2014 17:19:15 -0800	[thread overview]
Message-ID: <20140121011915.GM13432@cbox> (raw)
In-Reply-To: <CAA3XUr1=yeO=jJrWZJvrdBWbfes8+dHqkGkccjked=Racz3y_w@mail.gmail.com>

On Mon, Jan 06, 2014 at 05:59:03PM -0800, Victor Kamensky wrote:
> On 6 January 2014 14:56, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> > On Mon, Jan 06, 2014 at 10:31:42PM +0000, Peter Maydell wrote:
> >> On 6 January 2014 18:20, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> >  No matter how data is stored in memory (BE, LE, or
> >> > even PDP endianness), CPU registers always have a consistent
> >> > representation. They are immune to CPU endianness change, and storing
> >> > to/reading from memory won't change the value, as long as you use the
> >> > same endianness for writing/reading.
> >>
> >> Ah, endianness. This always confuses me, but I hope the following
> >> is correct... (in all the following when I say BE I mean BE8, not BE32,
> >> since BE32 and virtualization never occur in the same CPU).
> >>
> >> Certainly registers don't have endianness, but the entire point
> >> of the CPSR.E bit is exactly that it changes the value as it is
> >> stored to / read from memory, isn't it? -- that's where and when the
> >> byte-lane flipping happens.
> >>
> >> Where this impacts the hypervisor is that instead of actually sending
> >> the data out to the bus via the byte-swapping h/w, we've trapped instead.
> >> The hypervisor reads the original data directly from the guest CPU
> >> registers, and so it's the hypervisor and userspace support code that
> >> between them have to emulate the equivalent of the byte lane
> >> swapping h/w. You could argue that it shouldn't be the kernel's
> >> job, but since the kernel has to do it for the devices it emulates
> >> internally, I'm not sure that makes much sense.
> >
> > As far as I understand, this is exactly what vcpu_data_guest_to_host and
> > vcpu_data_host_to_guest do; emulate the byte lane swapping.
> >
> > The problem is that it only works on a little-endian host with the
> > current code, because be16_to_cpu (for example), actually perform a
> > byteswap, which is what needs to be emulated.  On a big-endian host, we
> > do nothing, so we end up giving a byteswapped value to the emulated
> > device.
> 
> Yes, that was my point on the thread: vcpu_data_guest_to_host and
> vcpu_data_host_to_guest functions for any given host endianity should
> give opposite endian results depending on CPSR E bit value. And
> currently it is not happening in BE host case. It seems that Peter and
> you agree with that and I gave example in another email with
> dynamically switching E bit illustrating this problem for BE host.
> 
> > I think a cleaner fix than this patch is to just change the
> > be16_to_cpu() to a __swab16() instead, which clearly indicates that
> > 'here is the byte lane swapping'.
> 
> Yes, that may work, but it is a bit orthogonal issue.

Why?  I don't think it is, I think it's addressing exactly the point at
hand.

> And I don't think
> it is better. For this to work one need to change canonical endianity on
> one of the sides around vcpu_data_guest_to_host and
> vcpu_data_host_to_guest functions.

You have to simply clearly define which format you want mmio.data to be
in.  This is a user space interface across multiple architectures and
therefore something you have to consider carefully and you're limited in
choices to something that works with existing user space code.

> 
> Changing  it on side that faces hypervisor (code that handles guest spilled
> CPU register set) does not make sense at all - if we will keep guest CPU
> register set in memory in LE form and hypervisor runs in BE (BE host),
> code that spills registers would need to do constant byteswaps. Also any
> access by host kernel and hypervisor (all running in BE) would need to do
> byteswaps while working with guest saved registers.
> 
> Changing canonical form of data on side that faces emulator and mmio
> part of kvm_run does not make sense either. kvm_run mmio.data field is
> bytes array, when it comes to host kernel from emulator, it already contains
> device memory in correct endian order that corresponds to endianity of
> emulated device. For example for LE device word read access, after call is
> emulated, mmio.data will contain mmio.data[0], mmio.data[1], mmio.data[2]
> mmio.data[3] values in LE order (mmio.data[3] is MSB). Now look at
> mmio_read_buf function introduced by Marc's 6d89d2d9 commit, this function
> will byte copy this mmio.data buffer into integer according to ongoing mmio
> access size. Note in BE host case such integer, in 'data' variable of
> kvm_handle_mmio_return function, will have byteswapped value. Now when it will
> be passed into vcpu_data_host_to_guest function, and it emulates read access
> of guest with E bit set, and if we follow your suggestion, it will be
> byteswapped.
> I.e 'data' integer will contain non byteswapped value of LE device. It will be
> further stored into some vcpu_reg register, still in native format (BE
> store), and
> further restored into guest CPU register, still non byteswapped (BE hypervisor).
> And that is not what BE client reading word of LE device expects - BE client
> knowing that it reads LE device with E bit set, it will issue additional rev
> instruction to get device memory as integer. If we really want to follow your
> suggestion, one may introduce compensatory byteswaps in mmio_read_buf
> and mmio_write_buf functions in case of BE host, rather then just do
> memcpy ... but I am not sure what it will buy us - in BE case it will swap data
> twice.
> 
> Note in above description by "canonical" I mean some form of data regardless
> of current access CPSR E value. But it may differ depending on host endianess.
> 

There's a lot of text to digest here, talking about a canonical form
here doesn't help; just define the layout of the destination byte array.
I also got completely lost in what you're referring to when you talk
about 'sides' here.

The thing we must decide is how the data is stored in
kvm_exit_mmio.data.  See Peter's recent thread "KVM and
variable-endianness guest CPUs".  Once we agree on this, the rest should
be easy (assuming we use the same structure for the data in the kernel's
internal kvm_exit_mmio declared on the stack in io_mem_abort()).

The format you suggest requires any consumer of this data to consider
the host endianness, which I don't think makes anything more clear (see
my comment on the vgic patch).

The in-kernel interface between the io_mem_abort() code and any
in-kernel emulated device must do exactly the same as the interface
between KVM and QEMU must do for KVM_EXIT_MMIO.

-- 
Christoffer

  reply	other threads:[~2014-01-21  1:19 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-20 16:48 [PATCH REPOST 0/5] armv7 BE kvm support Victor Kamensky
2013-12-20 16:48 ` [PATCH REPOST 1/5] ARM: kvm: replace push and pop with stdmb and ldmia instrs to enable assembler.h inclusion Victor Kamensky
2014-01-21  1:18   ` Christoffer Dall
2014-01-21  9:58     ` Marc Zyngier
2014-01-22  6:41       ` Victor Kamensky
2014-01-22 10:22         ` Will Deacon
2013-12-20 16:48 ` [PATCH REPOST 2/5] ARM: fix KVM assembler files to work in BE case Victor Kamensky
2014-01-21  1:18   ` Christoffer Dall
2013-12-20 16:48 ` [PATCH REPOST 3/5] ARM: kvm one_reg coproc set and get BE fixes Victor Kamensky
2014-01-21  1:18   ` Christoffer Dall
2013-12-20 16:48 ` [PATCH REPOST 4/5] ARM: kvm vgic mmio should return data in BE format in BE case Victor Kamensky
2014-01-21  1:19   ` Christoffer Dall
2013-12-20 16:48 ` [PATCH REPOST 5/5] ARM: kvm MMIO support BE host running LE code Victor Kamensky
2014-01-06 12:37   ` Marc Zyngier
2014-01-06 17:44     ` Victor Kamensky
2014-01-06 18:20       ` Marc Zyngier
2014-01-06 19:55         ` Victor Kamensky
2014-01-06 22:31         ` Peter Maydell
2014-01-06 22:56           ` Christoffer Dall
2014-01-07  1:59             ` Victor Kamensky
2014-01-21  1:19               ` Christoffer Dall [this message]
2014-01-21  5:24                 ` Victor Kamensky
2014-01-21  5:46                   ` Anup Patel
2014-01-21  6:31                     ` Christoffer Dall
2014-01-21  6:39                     ` Victor Kamensky
2014-01-21  6:03                   ` Christoffer Dall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140121011915.GM13432@cbox \
    --to=christoffer.dall@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.