linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Leonardo Bras <leobras.c@gmail.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>,
	Michael Ellerman <mpe@ellerman.id.au>,
	 Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>, Joel Stanley <joel@jms.id.au>,
	 Christophe Leroy <christophe.leroy@c-s.fr>,
	Thiago Jung Bauermann <bauerman@linux.ibm.com>,
	Ram Pai <linuxram@us.ibm.com>,
	Brian King <brking@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 5/7] powerpc/iommu: Move iommu_table cleaning routine to iommu_table_clean
Date: Wed, 22 Jul 2020 20:37:36 -0300	[thread overview]
Message-ID: <396d343962bb4e3b845bab4f46721a5ebed39cf6.camel@gmail.com> (raw)
In-Reply-To: <dd26d682-f013-763d-3a92-6d99633c6175@ozlabs.ru>

On Wed, 2020-07-22 at 11:28 +1000, Alexey Kardashevskiy wrote:
> 
> On 22/07/2020 08:13, Leonardo Bras wrote:
> > On Tue, 2020-07-21 at 14:59 +1000, Alexey Kardashevskiy wrote:
> > > On 16/07/2020 17:16, Leonardo Bras wrote:
> > > > Move the part of iommu_table_free() that does struct iommu_table cleaning
> > > > into iommu_table_clean, so we can invoke it separately.
> > > > 
> > > > This new function is useful for cleaning struct iommu_table before
> > > > initializing it again with a new DMA window, without having it freed and
> > > > allocated again.
> > > > 
> > > > Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> > > > ---
> > > >  arch/powerpc/kernel/iommu.c | 30 ++++++++++++++++++------------
> > > >  1 file changed, 18 insertions(+), 12 deletions(-)
> > > > 
> > > > diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> > > > index 9704f3f76e63..c3242253a4e7 100644
> > > > --- a/arch/powerpc/kernel/iommu.c
> > > > +++ b/arch/powerpc/kernel/iommu.c
> > > > @@ -735,21 +735,10 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
> > > >  	return tbl;
> > > >  }
> > > >  
> > > > -static void iommu_table_free(struct kref *kref)
> > > > +static void iommu_table_clean(struct iommu_table *tbl)
> > > 
> > > iommu_table_free() + iommu_init_table() + set_iommu_table_base() should
> > > work too, why new helper?
> > 
> > iommu_table_free() also frees the tbl, which would need allocate it
> > again (new address) and to fill it up again, unnecessarily. 
> 
> It is a new table in fact, everything is new there. You are only saving
> kfree+kzalloc which does not seem a huge win.
> 
> Also, iommu_table_update() simply assumes 64bit window by passing
> res_start=res_end=0 to iommu_init_table() which is not horribly robust
> either. Yeah, I know, iommu_init_table() is always called with zeroes in
> pseries but this is somewhat ok as those tables are from the device tree
> and those windows don't overlap with 32bit MMIO but under KVM they will
> (well, if we hack QEMU to advertise a single window).
> 
> I suggest removing iommu_pseries_table_update() from 6/7 and do
> iommu_table_free() + iommu_init_table() + set_iommu_table_base() with a
> WARN_ON(pdev->dev.archdata.dma_offset>=SZ_4G), may be even do this all
> in enable_ddw() where we know for sure if it is 1:1 mapping or just a
> big window.

Sure, I have yet to understand the full impact of this change, but I
will implement this and give it a try.

> 
> Out of curiosity - what page sizes does pHyp advertise in "query"?

64kB (page shift 0x10)

> 
> 
> > I think it's a better approach to only change what is needed.
> > 
> > > There is also iommu_table_clear() which does a different thing so you
> > > need a better name.
> > 
> > I agree.
> > I had not noticed this other function before sending the patchset. What
> > would be a better name though? __iommu_table_free()? 
> > 
> > > Second, iommu_table_free
> > > use and it would be ok as we would only see this when hot-unplugging a
> > > PE because we always kept the default window.
> > > Btw you must be seeing these warnings now every time you create DDW with
> > > these patches as at least the first page is reserved, do not you?
> > 
> > It does not print a warning.
> > I noticed other warnings,
> 
> And what are these?

tce_freemulti_pSeriesLP: plpar_tce_stuff failed
[...]

It's regarding the change in pagesize. 
Some places have the tceshift hardcoded as 12, tce_freemulti_pSeriesLP
is one of them, and that is causing some errors.

I wrote a patch fixing this, and I will include it in the next series.

> 
> > but not this one from iommu_table_free():
> > /* verify that table contains no entries */
> > if (!bitmap_empty(tbl->it_ma
> > p, tbl->it_size))
> > 	pr_warn("%s: Unexpected TCEs\n", __func__);
> > 
> > Before that, iommu_table_release_pages(tbl) is supposed to clear the 
> > bitmap, so this only tests for a tce that is created in this short period.
> 
> iommu_table_release_pages() only clears reserved pages - page 0 (just a
> protection against NULL DMA pointers) and 32bit MMIO (these should not
> be set for 64bit window). The "%s: Unexpected TCEs\n" is what checks for
> actual mapped TCEs.
> 

Oh, I haven't noticed that. Thanks for pointing!

> > > Since we are replacing a table for a device which is still in the
> > > system, we should not try messing with its DMA if it already has
> > > mappings so the warning should become an error preventing DDW. It is
> > > rather hard to trigger in practice but I could hack a driver to ask for
> > > 32bit DMA mask first, map few pages and then ask for 64bit DMA mask, it
> > > is not illegal, I think. So this needs a new helper - "bool
> > > iommu_table_in_use(tbl)" - to use in enable_ddw(). Or I am overthinking
> > > this?... Thanks,
> > 
> > As of today, there seems to be nothing like that happening in the
> > driver I am testing. 
> > I spoke to Brian King on slack, and he mentioned that at the point DDW
> > is created there should be no allocations in place.
> 
> Correct, there should not be. But it is also not a huge effort to fall
> back if there are.

True.

> 
> > But I suppose some driver could try to do this.
> > 
> > Maybe a better approach would be removing the mapping only if the
> > default window is removed (at the end of enable_ddw, as an else to
> > resetting the default DMA window), and having a way to add more
> > mappings to those pools. But this last part doesn't look so simple, and
> > it would be better to understand if it's necessary investing work in
> > this.
> > 
> > What do you think?
> 
> Add iommu_table_in_use(tbl) and fail DDW if that says "yes".

Seems good, I will include that on the next patchset.

Still, I will try to implement that more complex approach in a future
patchset, as it may come to be useful.

Thank you for the feedback!


  reply	other threads:[~2020-07-22 23:39 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-16  7:16 [PATCH v4 0/7] Remove default DMA window before creating DDW Leonardo Bras
2020-07-16  7:16 ` [PATCH v4 1/7] powerpc/pseries/iommu: Create defines for operations in ibm, ddw-applicable Leonardo Bras
2020-08-04 21:27   ` David Dai
2020-08-04 21:34   ` David Dai
2020-07-16  7:16 ` [PATCH v4 2/7] powerpc/pseries/iommu: Update call to ibm, query-pe-dma-windows Leonardo Bras
2020-08-04 21:31   ` David Dai
2020-07-16  7:16 ` [PATCH v4 3/7] powerpc/pseries/iommu: Move window-removing part of remove_ddw into remove_dma_window Leonardo Bras
2020-08-04 21:32   ` David Dai
2020-07-16  7:16 ` [PATCH v4 4/7] powerpc/pseries/iommu: Remove default DMA window before creating DDW Leonardo Bras
2020-08-04 21:33   ` David Dai
2020-07-16  7:16 ` [PATCH v4 5/7] powerpc/iommu: Move iommu_table cleaning routine to iommu_table_clean Leonardo Bras
2020-07-21  4:59   ` Alexey Kardashevskiy
2020-07-21 22:13     ` Leonardo Bras
2020-07-22  0:52       ` Brian King
2020-07-22 23:45         ` Leonardo Bras
2020-07-22  1:28       ` Alexey Kardashevskiy
2020-07-22 23:37         ` Leonardo Bras [this message]
2020-07-16  7:16 ` [PATCH v4 6/7] powerpc/pseries/iommu: Make use of DDW even if it does not map the partition Leonardo Bras
2020-07-21 21:39   ` Leonardo Bras
2020-07-16  7:16 ` [PATCH v4 7/7] powerpc/pseries/iommu: Rename "direct window" to "dma window" Leonardo Bras

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=396d343962bb4e3b845bab4f46721a5ebed39cf6.camel@gmail.com \
    --to=leobras.c@gmail.com \
    --cc=aik@ozlabs.ru \
    --cc=bauerman@linux.ibm.com \
    --cc=benh@kernel.crashing.org \
    --cc=brking@linux.vnet.ibm.com \
    --cc=christophe.leroy@c-s.fr \
    --cc=joel@jms.id.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=linuxram@us.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).