From: Leonardo Bras <leobras.c@gmail.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>, linuxppc-dev@lists.ozlabs.org
Cc: kvm-ppc@vger.kernel.org, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH kernel 2/2] powerpc/iommu: Do not immediately panic when failed IOMMU table allocation
Date: Mon, 22 Feb 2021 15:39:34 -0300
Message-ID: <6b4bed6881f66d9ebb45486042966f78593ac240.camel@gmail.com>
In-Reply-To: <0ae74efe-8b9a-566a-d984-8e5b662ff330@ozlabs.ru>

On Mon, 2021-02-22 at 16:24 +1100, Alexey Kardashevskiy wrote:
>
> On 18/02/2021 06:32, Leonardo Bras wrote:
> > On Tue, 2021-02-16 at 14:33 +1100, Alexey Kardashevskiy wrote:
> > > Most platforms allocate IOMMU table structures (specifically it_map)
> > > at boot time, and when this fails it is a valid reason to panic().
> > >
> > > However, the powernv platform allocates it_map after a device is returned
> > > to the host OS after being passed through, and this happens long after
> > > the host OS has booted. It is quite possible to trigger the it_map
> > > allocation panic() and kill the host even though it is not necessary: the
> > > host OS can still use the DMA bypass mode (which requires a tiny fraction
> > > of it_map's memory), and even if that fails, the host OS is still runnable,
> > > as it was without the device for which allocating it_map causes the panic.
> > >
> > > Instead of immediately crashing in a powernv/ioda2 system, this prints
> > > an error and continues. All other platforms still call panic().
> > >
> > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >
> > Hello Alexey,
> >
> > This looks like a good change that passes the panic() decision to platform
> > code. Everything looks pretty straightforward, but I have a question
> > regarding this:
> >
> > > @@ -1930,16 +1931,16 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
> > > >  		res_start = pe->phb->ioda.m32_pci_base >> tbl->it_page_shift;
> > > >  		res_end = min(window_size, SZ_4G) >> tbl->it_page_shift;
> > > >  	}
> > > > -	iommu_init_table(tbl, pe->phb->hose->node, res_start, res_end);
> > > > -	rc = pnv_pci_ioda2_set_window(&pe->table_group, 0, tbl);
> > > >
> > > > +	if (iommu_init_table(tbl, pe->phb->hose->node, res_start, res_end))
> > > > +		rc = pnv_pci_ioda2_set_window(&pe->table_group, 0, tbl);
> > > > +	else
> > > > +		rc = -ENOMEM;
> > > >  	if (rc) {
> > > > -		pe_err(pe, "Failed to configure 32-bit TCE table, err %ld\n",
> > > > -				rc);
> > > > +		pe_err(pe, "Failed to configure 32-bit TCE table, err %ld\n", rc);
> > > >  		iommu_tce_table_put(tbl);
> > > > -		return rc;
> > > > +		tbl = NULL; /* This clears iommu_table_base below */
> > > >  	}
> > > > -
> > > >  	if (!pnv_iommu_bypass_disabled)
> > > >  		pnv_pci_ioda2_set_bypass(pe, true);
> > > >
> >
> > If I understand correctly, previously, if iommu_init_table() did not
> > panic() and pnv_pci_ioda2_set_window() returned something other than 0,
> > the function would return rc from the if (rc) clause. Now that no longer
> > happens, and execution continues from if (!pnv_iommu_bypass_disabled)
> > onwards.
> >
> > Is that desired?
>
>
> Yes. A PE (==device, pretty much) has 2 DMA windows:
> - the default one which requires some RAM to operate
> - a bypass mode which tells the hardware that PCI addresses are
> statically mapped to RAM 1:1.
>
> This bypass mode does not require extra memory to work and is used in
> most cases on bare metal, as long as the device supports 64-bit DMA,
> which is everything except GPUs. Since it is cheap to enable and this is
> what we prefer anyway, there is no urge to fail.
>
>
> > As far as I could see, returning rc there seems like a good procedure
> > when iommu_init_table() fails and rc is set to -ENOMEM.
>
> This change is intentional, and yes, it could be done in a separate patch,
> but I figured there is not that much value in splitting it.

Ok then, thanks for clarifying.
FWIW:
Reviewed-by: Leonardo Bras <leobras.c@gmail.com>
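
For reference, the difference discussed above can be sketched as a small
userspace model. This is only an illustration under stated assumptions, not
the kernel code: struct fake_pe, try_default_window(), setup_early_return()
and setup_fallthrough() are hypothetical names, the kernel helpers are
replaced by plain flags, and the final "return 0" is an assumption of the
model. setup_early_return() models returning rc as asked about in the
review; setup_fallthrough() models the patch as posted, where a failed
32-bit window is only reported and bypass is still attempted.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PE state; not the kernel's struct pnv_ioda_pe. */
struct fake_pe {
	bool table_alloc_ok;  /* models iommu_init_table() succeeding */
	long set_window_rc;   /* models the pnv_pci_ioda2_set_window() result */
	bool bypass_disabled; /* models pnv_iommu_bypass_disabled */
	bool has_table;       /* outcome: default 32-bit window configured */
	bool has_bypass;      /* outcome: bypass mode enabled */
};

/* Common first half: try to configure the default 32-bit window. */
static long try_default_window(struct fake_pe *pe)
{
	long rc;

	if (pe->table_alloc_ok)
		rc = pe->set_window_rc;
	else
		rc = -ENOMEM;
	if (!rc)
		pe->has_table = true;
	return rc;
}

/* Variant asked about in the review: bail out as soon as rc is non-zero. */
long setup_early_return(struct fake_pe *pe)
{
	long rc = try_default_window(pe);

	if (rc)
		return rc; /* bypass is never attempted */
	if (!pe->bypass_disabled)
		pe->has_bypass = true;
	return 0;
}

/* Variant in the patch: report the failure, drop the table, keep going. */
long setup_fallthrough(struct fake_pe *pe)
{
	long rc = try_default_window(pe);

	(void)rc; /* the patch only logs rc via pe_err() at this point */
	if (!pe->bypass_disabled)
		pe->has_bypass = true;
	return 0; /* assumption of this model */
}
```

With table_alloc_ok set to false, setup_early_return() leaves the PE with
neither window, while setup_fallthrough() still ends up with bypass enabled
(unless bypass is disabled), which matches the explanation given above.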