From: Alistair Popple <alistair@popple.id.au>
To: linuxppc-dev@lists.ozlabs.org
Cc: "Jose Ricardo Ziviani" <joserz@linux.ibm.com>,
kvm@vger.kernel.org, "Alexey Kardashevskiy" <aik@ozlabs.ru>,
"Daniel Henrique Barboza" <danielhb413@gmail.com>,
"Piotr Jaroszynski" <pjaroszynski@nvidia.com>,
kvm-ppc@vger.kernel.org, "Sam Bobroff" <sbobroff@linux.ibm.com>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Leonardo Augusto Guimarães Garcia" <lagarcia@br.ibm.com>,
"Reza Arbab" <arbab@linux.ibm.com>,
"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [PATCH kernel v3] powerpc/powernv: Isolate NVLinks between GV100GL on Witherspoon
Date: Tue, 30 Apr 2019 15:45:28 +1000 [thread overview]
Message-ID: <5149814.2BROG1NTNO@townsend> (raw)
In-Reply-To: <da41cd35-32f6-043e-13ab-9a225c4e910a@ozlabs.ru>
Alexey,
> >>> +void pnv_try_isolate_nvidia_v100(struct pci_dev *bridge)
> >>> +{
> >>> + u32 mask, val;
> >>> + void __iomem *bar0_0, *bar0_120000, *bar0_a00000;
> >>> + struct pci_dev *pdev;
> >>> + u16 cmd = 0, cmdmask = PCI_COMMAND_MEMORY;
> >>> +
> >>> + if (!bridge->subordinate)
> >>> + return;
> >>> +
> >>> + pdev = list_first_entry_or_null(&bridge->subordinate->devices,
> >>> + struct pci_dev, bus_list);
> >>> + if (!pdev)
> >>> + return;
> >>> +
> >>> + if (pdev->vendor != PCI_VENDOR_ID_NVIDIA)
Don't you also need to check the PCIe devid to match only [PV]100 devices as
well? I doubt there's any guarantee these registers will remain the same for
all future (or older) NVIDIA devices.
IMHO this should really be done in the device driver in the guest. A malcious
guest could load a modified driver that doesn't do this, but that should not
compromise other guests which presumably load a non-compromised driver that
disables the links on that guests GPU. However I guess in practice what you
have here should work equally well.
- Alistair
> >>> + return;
> >>> +
> >>> + mask = nvlinkgpu_get_disable_mask(&pdev->dev);
> >>> + if (!mask)
> >>> + return;
> >>> +
> >>> + bar0_0 = pci_iomap_range(pdev, 0, 0, 0x10000);
> >>> + if (!bar0_0) {
> >>> + pci_err(pdev, "Error mapping BAR0 @0\n");
> >>> + return;
> >>> + }
> >>> + bar0_120000 = pci_iomap_range(pdev, 0, 0x120000, 0x10000);
> >>> + if (!bar0_120000) {
> >>> + pci_err(pdev, "Error mapping BAR0 @120000\n");
> >>> + goto bar0_0_unmap;
> >>> + }
> >>> + bar0_a00000 = pci_iomap_range(pdev, 0, 0xA00000, 0x10000);
> >>> + if (!bar0_a00000) {
> >>> + pci_err(pdev, "Error mapping BAR0 @A00000\n");
> >>> + goto bar0_120000_unmap;
> >>> + }
> >>
> >> Is it really necessary to do three separate ioremaps vs one that would
> >> cover them all here? I suspect you're just sneaking in PAGE_SIZE with
> >> the 0x10000 size mappings anyway. Seems like it would simplify setup,
> >> error reporting, and cleanup to to ioremap to the PAGE_ALIGN'd range
> >> of the highest register accessed. Thanks,
> >
> > Sure I can map it once, I just do not see the point in mapping/unmapping
> > all 0xa10000>>16=161 system pages for a very short period of time while
> > we know precisely that we need just 3 pages.
> >
> > Repost?
>
> Ping?
>
> Can this go in as it is (i.e. should I ping Michael) or this needs
> another round? It would be nice to get some formal acks. Thanks,
>
> >> Alex
> >>
> >>> +
> >>> + pci_restore_state(pdev);
> >>> + pci_read_config_word(pdev, PCI_COMMAND, &cmd);
> >>> + if ((cmd & cmdmask) != cmdmask)
> >>> + pci_write_config_word(pdev, PCI_COMMAND, cmd | cmdmask);
> >>> +
> >>> + /*
> >>> + * The sequence is from "Tesla P100 and V100 SXM2 NVLink Isolation on
> >>> + * Multi-Tenant Systems".
> >>> + * The register names are not provided there either, hence raw values.
> >>> + */
> >>> + iowrite32(0x4, bar0_120000 + 0x4C);
> >>> + iowrite32(0x2, bar0_120000 + 0x2204);
> >>> + val = ioread32(bar0_0 + 0x200);
> >>> + val |= 0x02000000;
> >>> + iowrite32(val, bar0_0 + 0x200);
> >>> + val = ioread32(bar0_a00000 + 0x148);
> >>> + val |= mask;
> >>> + iowrite32(val, bar0_a00000 + 0x148);
> >>> +
> >>> + if ((cmd | cmdmask) != cmd)
> >>> + pci_write_config_word(pdev, PCI_COMMAND, cmd);
> >>> +
> >>> + pci_iounmap(pdev, bar0_a00000);
> >>> +bar0_120000_unmap:
> >>> + pci_iounmap(pdev, bar0_120000);
> >>> +bar0_0_unmap:
> >>> + pci_iounmap(pdev, bar0_0);
> >>> +}
next prev parent reply other threads:[~2019-04-30 5:48 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-11 6:48 [PATCH kernel v3] powerpc/powernv: Isolate NVLinks between GV100GL on Witherspoon Alexey Kardashevskiy
2019-04-11 16:52 ` Alex Williamson
2019-04-12 3:48 ` Alexey Kardashevskiy
2019-04-29 6:50 ` Alexey Kardashevskiy
2019-04-30 5:45 ` Alistair Popple [this message]
2019-04-30 6:14 ` Alexey Kardashevskiy
2019-04-30 13:20 ` Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5149814.2BROG1NTNO@townsend \
--to=alistair@popple.id.au \
--cc=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=arbab@linux.ibm.com \
--cc=danielhb413@gmail.com \
--cc=david@gibson.dropbear.id.au \
--cc=joserz@linux.ibm.com \
--cc=kvm-ppc@vger.kernel.org \
--cc=kvm@vger.kernel.org \
--cc=lagarcia@br.ibm.com \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=pjaroszynski@nvidia.com \
--cc=sbobroff@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).