From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: linuxppc-dev@lists.ozlabs.org,
Alex Williamson <alex.williamson@redhat.com>,
Paul Mackerras <paulus@samba.org>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH] powerpc-powernv: align BARs to PAGE_SIZE on powernv platform
Date: Wed, 05 Sep 2012 11:16:43 +1000 [thread overview]
Message-ID: <1346807803.2257.19.camel@pasglop> (raw)
In-Reply-To: <5046A2F1.1020004@ozlabs.ru>
> > It's still bad in more ways that I care to explain...
>
> Well it is right before pci_reassigndev_resource_alignment() which is
> common and does the same thing.
>
> > The main one is that you do the "fixup" in a very wrong place anyway and
> > it might cause cases of overlapping BARs.
>
> As far as I can tell it may only happen if someone tries to align resource
> via kernel command line.
>
> But ok. I trust you :)
I have reasons to believe that this realignment crap is wrong too :-)
> > In any case this is wrong. It's a VFIO design bug and needs to be fixed
> > there (CC'ing Alex).
>
> It can be fixed in VFIO only if VFIO will stop treating functions
> separately and start mapping group's MMIO space as a whole thing. But this
> is not going to happen.
It still can be fixed without that...
> The example of the problem is NEC USB PCI which has 3 functions, each has
> one BAR, these BARs are 4K aligned and I cannot see how it can be fixed
> with 64K page size and VFIO creating memory regions per BAR (not per PHB).
VFIO can perfectly well realize it's the same MR or even map the same
area 3 times and create 3 MRs, both options work. All it needs is to
know the offset of the BAR inside the page.
> > IE. We need a way to know where the BAR is within a page at which point
> > VFIO can still map the page, but can also properly take into account the
> > offset.
>
> It is not about VFIO, it is about KVM. I cannot put non-aligned page to
> kvm_set_phys_mem(). Cannot understand how we would solve this.
No, VFIO still maps the whole page and creates an MR for the whole page,
that's fine. But you still need to know the offset within the page.
Now the main problem here is going to be that the guest itself might
reallocate the BAR and move it around (well, it's version of the BAR
which isn't the real thing), and so we cannot create a direct MMU
mapping between -that- and the real BAR.
IE. We can only allow that direct mapping if the guest BAR mapping has
the same "offset within page" as the host BAR mapping.
Our guests don't mess with BARs but SLOF does ... it's really tempting
to look into bringing the whole BAR allocation back into qemu and out of
SLOF :-( (We might have to if we ever do hotplug anyway). That way qemu
could set offsets that match appropriately.
Cheers,
Ben.
> You better discuss it with David, my vocab is weak.
>
>
>
> > We also need a way to tell VFIO userspace that it's OK to use the fast
> > path for such small BARs. It's not for all host platforms. We know it's
> > ok for PowerNV because we know the devices are grouped by PEs and the PE
> > granularity is larger than a page but that's not necessarily going to be
> > the case on all powerpc platforms that support KVM.
> >
> > Cheers,
> > Ben.
> >
> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >> ---
> >> arch/powerpc/platforms/powernv/setup.c | 26 ++++++++++++++++++++++++++
> >> 1 file changed, 26 insertions(+)
> >>
> >> diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
> >> index db1ad1c..331838e 100644
> >> --- a/arch/powerpc/platforms/powernv/setup.c
> >> +++ b/arch/powerpc/platforms/powernv/setup.c
> >> @@ -25,6 +25,7 @@
> >> #include <linux/of.h>
> >> #include <linux/interrupt.h>
> >> #include <linux/bug.h>
> >> +#include <linux/pci.h>
> >>
> >> #include <asm/machdep.h>
> >> #include <asm/firmware.h>
> >> @@ -179,6 +180,30 @@ static int __init pnv_probe(void)
> >> return 1;
> >> }
> >>
> >> +static void pnv_pcibios_fixup_resources(struct pci_dev *pdev)
> >> +{
> >> + struct resource *r;
> >> + int i;
> >> +
> >> + /*
> >> + * Aligning resources to PAGE_SIZE in order to
> >> + * support "fast" path for PCI BAR access under VFIO
> >> + * which maps every BAR individually to the guest
> >> + * so BARs have to be PAGE aligned.
> >> + */
> >> + for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
> >> + r = &pdev->resource[i];
> >> + if (!r->flags)
> >> + continue;
> >> + pr_debug("powernv: %s, aligning BAR#%d %llx..%llx",
> >> + pdev->dev.kobj.name, i, r->start, r->end);
> >> + r->end = PAGE_ALIGN(r->end - r->start + 1) - 1;
> >> + r->start = 0;
> >> + r->flags |= IORESOURCE_UNSET;
> >> + pr_debug(" to %llx..%llx\n", r->start, r->end);
> >> + }
> >> +}
> >> +
> >> define_machine(powernv) {
> >> .name = "PowerNV",
> >> .probe = pnv_probe,
> >> @@ -189,6 +214,7 @@ define_machine(powernv) {
> >> .progress = pnv_progress,
> >> .power_save = power7_idle,
> >> .calibrate_decr = generic_calibrate_decr,
> >> + .pcibios_fixup_resources= pnv_pcibios_fixup_resources,
> >> #ifdef CONFIG_KEXEC
> >> .kexec_cpu_down = pnv_kexec_cpu_down,
> >> #endif
> >
> >
>
>
next prev parent reply other threads:[~2012-09-05 1:16 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20120821113534.GS29724@truffula.fritz.box>
2012-09-04 7:33 ` [PATCH] vfio: enabled and supported on power (v7) Alexey Kardashevskiy
2012-09-04 7:35 ` [PATCH] powerpc-powernv: added tce_get callback for powernv platform Alexey Kardashevskiy
2012-09-04 19:41 ` Benjamin Herrenschmidt
2012-09-04 22:35 ` David Gibson
2012-09-05 0:19 ` Alexey Kardashevskiy
2012-09-05 0:32 ` Benjamin Herrenschmidt
2012-09-04 7:36 ` [PATCH] powerpc-kvm: fixing page alignment for TCE Alexey Kardashevskiy
2012-09-20 9:01 ` Alexander Graf
2012-09-04 7:36 ` [PATCH] powerpc-powernv: align BARs to PAGE_SIZE on powernv platform Alexey Kardashevskiy
2012-09-04 19:45 ` Benjamin Herrenschmidt
2012-09-05 0:55 ` Alexey Kardashevskiy
2012-09-05 1:16 ` Benjamin Herrenschmidt [this message]
2012-09-05 4:57 ` Alex Williamson
2012-09-05 5:17 ` Benjamin Herrenschmidt
2012-09-05 5:27 ` Alexey Kardashevskiy
2012-09-10 17:06 ` Alex Williamson
2012-09-10 16:02 ` [PATCH] vfio: enabled and supported on power (v7) Alex Williamson
2012-09-11 8:28 ` Alexey Kardashevskiy
2012-09-13 22:34 ` Alex Williamson
2012-09-13 22:41 ` Scott Wood
2012-09-13 22:55 ` Alex Williamson
2012-09-14 0:51 ` Alexey Kardashevskiy
2012-09-14 4:35 ` Alex Williamson
2012-10-11 8:19 ` Alexey Kardashevskiy
2012-10-11 18:09 ` Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1346807803.2257.19.camel@pasglop \
--to=benh@kernel.crashing.org \
--cc=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=david@gibson.dropbear.id.au \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox