From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: kvm@vger.kernel.org,
Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>,
Alexey Kardashevskiy <aik@ozlabs.ru>,
linux-kernel@vger.kernel.org, Alexander Graf <agraf@suse.de>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] vfio: Fix endianness handling for emulated BARs
Date: Sat, 21 Jun 2014 09:12:41 +1000 [thread overview]
Message-ID: <1403305961.4587.66.camel@pasglop> (raw)
In-Reply-To: <1403234514.3707.278.camel@ul30vt.home>
On Thu, 2014-06-19 at 21:21 -0600, Alex Williamson wrote:
> Working on big endian being an accident may be a matter of perspective
:-)
> The comment remains that this patch doesn't actually fix anything except
> the overhead on big endian systems doing redundant byte swapping and
> maybe the philosophy that vfio regions are little endian.
Yes, that works by accident because technically VFIO is a transport and
thus shouldn't perform any endian swapping of any sort, which remains
the responsibility of the end driver which is the only one to know
whether a given BAR location is a a register or some streaming data
and in the former case whether it's LE or BE (some PCI devices are BE
even ! :-)
But yes, in the end, it works with the dual "cancelling" swaps and the
overhead of those swaps is probably drowned in the noise of the syscall
overhead.
> I'm still not a fan of iowrite vs iowritebe, there must be something we
> can use that doesn't have an implicit swap.
Sadly there isn't ... In the old day we didn't even have the "be"
variant and readl/writel style accessors still don't have them either
for all archs.
There is __raw_readl/writel but here the semantics are much more than
just "don't swap", they also don't have memory barriers (which means
they are essentially useless to most drivers unless those are platform
specific drivers which know exactly what they are doing, or in the rare
cases such as accessing a framebuffer which we know never have side
effects).
> Calling it iowrite*_native is also an abuse of the namespace.
> Next thing we know some common code
> will legitimately use that name.
I might make sense to those definitions into a common header. There have
been a handful of cases in the past that wanted that sort of "native
byte order" MMIOs iirc (though don't ask me for examples, I can't really
remember).
> If we do need to define an alias
> (which I'd like to avoid) it should be something like vfio_iowrite32.
> Thanks,
Cheers,
Ben.
> Alex
>
> > > ===
> > >
> > > any better?
> > >
> > >
> > >
> > >
> > >>>> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > >>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > >>>> ---
> > >>>> drivers/vfio/pci/vfio_pci_rdwr.c | 20 ++++++++++++++++----
> > >>>> 1 file changed, 16 insertions(+), 4 deletions(-)
> > >>>>
> > >>>> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> > >>>> index 210db24..f363b5a 100644
> > >>>> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> > >>>> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> > >>>> @@ -21,6 +21,18 @@
> > >>>>
> > >>>> #include "vfio_pci_private.h"
> > >>>>
> > >>>> +#ifdef __BIG_ENDIAN__
> > >>>> +#define ioread16_native ioread16be
> > >>>> +#define ioread32_native ioread32be
> > >>>> +#define iowrite16_native iowrite16be
> > >>>> +#define iowrite32_native iowrite32be
> > >>>> +#else
> > >>>> +#define ioread16_native ioread16
> > >>>> +#define ioread32_native ioread32
> > >>>> +#define iowrite16_native iowrite16
> > >>>> +#define iowrite32_native iowrite32
> > >>>> +#endif
> > >>>> +
> > >>>> /*
> > >>>> * Read or write from an __iomem region (MMIO or I/O port) with an excluded
> > >>>> * range which is inaccessible. The excluded range drops writes and fills
> > >>>> @@ -50,9 +62,9 @@ static ssize_t do_io_rw(void __iomem *io, char __user *buf,
> > >>>> if (copy_from_user(&val, buf, 4))
> > >>>> return -EFAULT;
> > >>>>
> > >>>> - iowrite32(le32_to_cpu(val), io + off);
> > >>>> + iowrite32_native(val, io + off);
> > >>>> } else {
> > >>>> - val = cpu_to_le32(ioread32(io + off));
> > >>>> + val = ioread32_native(io + off);
> > >>>>
> > >>>> if (copy_to_user(buf, &val, 4))
> > >>>> return -EFAULT;
> > >>>> @@ -66,9 +78,9 @@ static ssize_t do_io_rw(void __iomem *io, char __user *buf,
> > >>>> if (copy_from_user(&val, buf, 2))
> > >>>> return -EFAULT;
> > >>>>
> > >>>> - iowrite16(le16_to_cpu(val), io + off);
> > >>>> + iowrite16_native(val, io + off);
> > >>>> } else {
> > >>>> - val = cpu_to_le16(ioread16(io + off));
> > >>>> + val = ioread16_native(io + off);
> > >>>>
> > >>>> if (copy_to_user(buf, &val, 2))
> > >>>> return -EFAULT;
> > >>>
> > >>>
> > >>>
> > >>
> > >>
> > >
> > >
> >
> >
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
next prev parent reply other threads:[~2014-06-20 23:12 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-18 11:36 [PATCH] vfio: Fix endianness handling for emulated BARs Alexey Kardashevskiy
2014-06-18 18:35 ` Alex Williamson
2014-06-19 0:50 ` Alexey Kardashevskiy
2014-06-19 1:50 ` Alexey Kardashevskiy
2014-06-19 1:50 ` Alexey Kardashevskiy
2014-06-19 3:48 ` Alexey Kardashevskiy
2014-06-19 5:30 ` Bharat.Bhushan
2014-06-19 6:17 ` Alexey Kardashevskiy
2014-06-20 3:21 ` Alex Williamson
2014-06-20 14:14 ` Alexey Kardashevskiy
2014-06-20 23:16 ` Benjamin Herrenschmidt
2014-06-20 23:12 ` Benjamin Herrenschmidt [this message]
2014-06-24 10:11 ` Alexey Kardashevskiy
2014-06-24 10:41 ` Alexander Graf
2014-06-24 12:50 ` Alexey Kardashevskiy
2014-06-24 12:52 ` Alexander Graf
2014-06-24 13:01 ` Alexey Kardashevskiy
2014-06-24 13:22 ` Alexander Graf
2014-06-24 14:21 ` Alex Williamson
2014-06-24 14:33 ` Alexey Kardashevskiy
2014-06-24 14:40 ` David Laight
2014-06-24 14:43 ` Alex Williamson
2014-06-24 16:26 ` Alexey Kardashevskiy
2014-06-24 21:54 ` Benjamin Herrenschmidt
2014-06-25 2:43 ` Alexey Kardashevskiy
2014-06-24 21:46 ` Benjamin Herrenschmidt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1403305961.4587.66.camel@pasglop \
--to=benh@kernel.crashing.org \
--cc=agraf@suse.de \
--cc=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=nikunj@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).