From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org,
Alex Williamson <alex.williamson@redhat.com>,
Paul Mackerras <paulus@samba.org>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH] powerpc-powernv: align BARs to PAGE_SIZE on powernv platform
Date: Wed, 05 Sep 2012 10:55:13 +1000 [thread overview]
Message-ID: <5046A2F1.1020004@ozlabs.ru> (raw)
In-Reply-To: <1346787940.3025.11.camel@pasglop>
On 05/09/12 05:45, Benjamin Herrenschmidt wrote:
> On Tue, 2012-09-04 at 17:36 +1000, Alexey Kardashevskiy wrote:
>> VFIO adds a separate memory region for every BAR and tries
>> to mmap() it to provide direct BAR mapping to the guest.
>> If it succeedes, QEMU registers this address with kvm_set_phys_mem().
>> However it is not always possible because such a BAR should
>> be host page size aligned. In this case VFIO uses "slow" path
>> and emulated BAR access in QEMU.
>>
>> In order to avoid "slow" path, BARs have to be PAGE_SIZE aligned
>> in the host kernel and this is what the patch does.
>>
>> The patch adds powernv platform specific hook which makes all
>> BARs sizes 64K aligned. The pci_reassigndev_resource_alignment()
>> function from drivers/pci/pci.c has been used as a reference.
>>
>> This is purely an optimization patch, the things will work without
>> it, just a bit slower.
>
> It's still bad in more ways that I care to explain...
Well it is right before pci_reassigndev_resource_alignment() which is
common and does the same thing.
> The main one is that you do the "fixup" in a very wrong place anyway and
> it might cause cases of overlapping BARs.
As far as I can tell it may only happen if someone tries to align resource
via kernel command line.
But ok. I trust you :)
> In any case this is wrong. It's a VFIO design bug and needs to be fixed
> there (CC'ing Alex).
It can be fixed in VFIO only if VFIO will stop treating functions
separately and start mapping group's MMIO space as a whole thing. But this
is not going to happen.
The example of the problem is NEC USB PCI which has 3 functions, each has
one BAR, these BARs are 4K aligned and I cannot see how it can be fixed
with 64K page size and VFIO creating memory regions per BAR (not per PHB).
> IE. We need a way to know where the BAR is within a page at which point
> VFIO can still map the page, but can also properly take into account the
> offset.
It is not about VFIO, it is about KVM. I cannot put non-aligned page to
kvm_set_phys_mem(). Cannot understand how we would solve this.
You better discuss it with David, my vocab is weak.
> We also need a way to tell VFIO userspace that it's OK to use the fast
> path for such small BARs. It's not for all host platforms. We know it's
> ok for PowerNV because we know the devices are grouped by PEs and the PE
> granularity is larger than a page but that's not necessarily going to be
> the case on all powerpc platforms that support KVM.
>
> Cheers,
> Ben.
>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>> arch/powerpc/platforms/powernv/setup.c | 26 ++++++++++++++++++++++++++
>> 1 file changed, 26 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
>> index db1ad1c..331838e 100644
>> --- a/arch/powerpc/platforms/powernv/setup.c
>> +++ b/arch/powerpc/platforms/powernv/setup.c
>> @@ -25,6 +25,7 @@
>> #include <linux/of.h>
>> #include <linux/interrupt.h>
>> #include <linux/bug.h>
>> +#include <linux/pci.h>
>>
>> #include <asm/machdep.h>
>> #include <asm/firmware.h>
>> @@ -179,6 +180,30 @@ static int __init pnv_probe(void)
>> return 1;
>> }
>>
>> +static void pnv_pcibios_fixup_resources(struct pci_dev *pdev)
>> +{
>> + struct resource *r;
>> + int i;
>> +
>> + /*
>> + * Aligning resources to PAGE_SIZE in order to
>> + * support "fast" path for PCI BAR access under VFIO
>> + * which maps every BAR individually to the guest
>> + * so BARs have to be PAGE aligned.
>> + */
>> + for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
>> + r = &pdev->resource[i];
>> + if (!r->flags)
>> + continue;
>> + pr_debug("powernv: %s, aligning BAR#%d %llx..%llx",
>> + pdev->dev.kobj.name, i, r->start, r->end);
>> + r->end = PAGE_ALIGN(r->end - r->start + 1) - 1;
>> + r->start = 0;
>> + r->flags |= IORESOURCE_UNSET;
>> + pr_debug(" to %llx..%llx\n", r->start, r->end);
>> + }
>> +}
>> +
>> define_machine(powernv) {
>> .name = "PowerNV",
>> .probe = pnv_probe,
>> @@ -189,6 +214,7 @@ define_machine(powernv) {
>> .progress = pnv_progress,
>> .power_save = power7_idle,
>> .calibrate_decr = generic_calibrate_decr,
>> + .pcibios_fixup_resources= pnv_pcibios_fixup_resources,
>> #ifdef CONFIG_KEXEC
>> .kexec_cpu_down = pnv_kexec_cpu_down,
>> #endif
>
>
--
Alexey
next prev parent reply other threads:[~2012-09-05 0:55 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20120821113534.GS29724@truffula.fritz.box>
2012-09-04 7:33 ` [PATCH] vfio: enabled and supported on power (v7) Alexey Kardashevskiy
2012-09-04 7:35 ` [PATCH] powerpc-powernv: added tce_get callback for powernv platform Alexey Kardashevskiy
2012-09-04 19:41 ` Benjamin Herrenschmidt
2012-09-04 22:35 ` David Gibson
2012-09-05 0:19 ` Alexey Kardashevskiy
2012-09-05 0:32 ` Benjamin Herrenschmidt
2012-09-04 7:36 ` [PATCH] powerpc-kvm: fixing page alignment for TCE Alexey Kardashevskiy
2012-09-04 7:36 ` Alexey Kardashevskiy
2012-09-20 9:01 ` Alexander Graf
2012-09-20 9:01 ` Alexander Graf
2012-09-04 7:36 ` [PATCH] powerpc-powernv: align BARs to PAGE_SIZE on powernv platform Alexey Kardashevskiy
2012-09-04 19:45 ` Benjamin Herrenschmidt
2012-09-05 0:55 ` Alexey Kardashevskiy [this message]
2012-09-05 1:16 ` Benjamin Herrenschmidt
2012-09-05 4:57 ` Alex Williamson
2012-09-05 5:17 ` Benjamin Herrenschmidt
2012-09-05 5:27 ` Alexey Kardashevskiy
2012-09-10 17:06 ` Alex Williamson
2012-09-10 16:02 ` [PATCH] vfio: enabled and supported on power (v7) Alex Williamson
2012-09-11 8:28 ` Alexey Kardashevskiy
2012-09-13 22:34 ` Alex Williamson
2012-09-13 22:41 ` Scott Wood
2012-09-13 22:55 ` Alex Williamson
2012-09-14 0:51 ` Alexey Kardashevskiy
2012-09-14 4:35 ` Alex Williamson
2012-10-11 8:19 ` Alexey Kardashevskiy
2012-10-11 18:09 ` Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5046A2F1.1020004@ozlabs.ru \
--to=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=benh@kernel.crashing.org \
--cc=david@gibson.dropbear.id.au \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.