From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Cc: aik@au1.ibm.com, anton@au1.ibm.com, paulus@samba.org,
sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
davem@davemloft.net
Subject: Re: [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock
Date: Mon, 30 Mar 2015 21:55:43 +1100 [thread overview]
Message-ID: <1427712943.20500.77.camel@kernel.crashing.org> (raw)
In-Reply-To: <20150330103824.GA26127@oracle.com>
On Mon, 2015-03-30 at 06:38 -0400, Sowmini Varadhan wrote:
> On (03/30/15 14:24), Benjamin Herrenschmidt wrote:
> > > +
> > > +#define IOMMU_POOL_HASHBITS 4
> > > +#define IOMMU_NR_POOLS (1 << IOMMU_POOL_HASHBITS)
> >
> > I don't like those macros. You changed the value from what we had on
> > powerpc. It could be that the new values are as good for us but I'd
> > like to do a bit differently. Can you make the bits a variable ? Or at
> > least an arch-provided macro which we can change later if needed ?
>
> Actuallly, this are just the upper bound (16) on the number of pools.
>
> The actual number is selected by the value passed to the
> iommu_tbl_range_init(), and is not hard-coded (as was the case with
> the power-pc code).
>
> Thus powerpc can continue to use 4 pools without any loss of
> generality.
Right, but it affects the way we hash... not a huge deal though.
> :
> > Let's make it clear that this is for allocation of DMA space only, it
> > would thus make my life easier when adapting powerpc to use a different
> > names, something like "struct iommu_area" works for me, or "iommu
> > alloc_region" .. whatever you fancy the most.
> :
> > Why adding the 'arena' prefix ? What was wrong with "pools" in the
> > powerpc imlementation ?
>
> for the same reason you want to re-baptize iommu_table above- at
> the time, I was doing it to minimize conflicts with existing usage.
> But I can rename everything if you like.
But in that case it doesn't make much sense and makes the names longer.
Those are just "pools", it's sufficient.
> > > +#define IOMMU_LARGE_ALLOC 15
> >
> > Make that a variable, here too, the arch might want to tweak it.
> >
> > I think 15 is actually not a good value for powerpc with 4k iommu pages
> > and 64k PAGE_SIZE, we should make the above some kind of factor of
> > PAGE_SHIFT - iommu_page_shift.
>
> Ok.
>
> > > + /* Sanity check */
> > > + if (unlikely(npages == 0)) {
> > > + printk_ratelimited("npages == 0\n");
> >
> > You removed the WARN_ON here which had the nice property of giving you a
> > backtrace which points to the offender. The above message alone is
> > useless.
>
> yes, the original code was generating checkpatch warnings and errors.
> That's why I removed it.
Put it back please, and ignore checkpatch, it's become an annoyance more
than anything else.
Note that nowadays, you can probably use WARN_ON_ONCE(npages == 0); in
place of the whole construct.
> > I am not certain about the "first unlocked pool"... We take the lock for
> > a fairly short amount of time, I have the gut feeling that the cache
> > line bouncing introduced by looking at a different pool may well cost
> > more than waiting a bit longer. Did do some measurements of that
> > optimisation ?
>
> if you are really only taking it for a short amount of time, then
> the trylock should just succeed, so there is no cache line bouncing.
No that's not my point. The lock is only taken for a short time but
might still collide, the bouncing in that case will probably (at least
that's my feeling) hurt more than help.
However, I have another concern with your construct. Essentially you
spin looking for an unlocked pool without a cpu_relax. Now it's unlikely
but you *can* end up eating cycles, which on a high SMT like POWER8
might mean slowing down the actual owner of the pool lock.
> But yes, I did instrument it with iperf, and there was lock contention
> on the spinlock, which was eliminted by the trylock.
What is iperf ? What does that mean "there was lock contention" ? IE,
was the overall performance improved or not ? Removing contention by
trading it for cache line bouncing will not necessarily help. I'm not
saying this is a bad optimisation but it makes the code messy and I
think deserves some solid numbers demonstrating its worth.
> I'll fix the rest of the variable naming etc nits and send out
> a new version later today.
There was also an actual bug in the case where you hop'ed to a new pool
and forgot the flush.
Cheers,
Ben.
WARNING: multiple messages have this Message-ID (diff)
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Cc: aik@au1.ibm.com, anton@au1.ibm.com, paulus@samba.org,
sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
davem@davemloft.net
Subject: Re: [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and l
Date: Mon, 30 Mar 2015 10:55:43 +0000 [thread overview]
Message-ID: <1427712943.20500.77.camel@kernel.crashing.org> (raw)
In-Reply-To: <20150330103824.GA26127@oracle.com>
On Mon, 2015-03-30 at 06:38 -0400, Sowmini Varadhan wrote:
> On (03/30/15 14:24), Benjamin Herrenschmidt wrote:
> > > +
> > > +#define IOMMU_POOL_HASHBITS 4
> > > +#define IOMMU_NR_POOLS (1 << IOMMU_POOL_HASHBITS)
> >
> > I don't like those macros. You changed the value from what we had on
> > powerpc. It could be that the new values are as good for us but I'd
> > like to do a bit differently. Can you make the bits a variable ? Or at
> > least an arch-provided macro which we can change later if needed ?
>
> Actuallly, this are just the upper bound (16) on the number of pools.
>
> The actual number is selected by the value passed to the
> iommu_tbl_range_init(), and is not hard-coded (as was the case with
> the power-pc code).
>
> Thus powerpc can continue to use 4 pools without any loss of
> generality.
Right, but it affects the way we hash... not a huge deal though.
> :
> > Let's make it clear that this is for allocation of DMA space only, it
> > would thus make my life easier when adapting powerpc to use a different
> > names, something like "struct iommu_area" works for me, or "iommu
> > alloc_region" .. whatever you fancy the most.
> :
> > Why adding the 'arena' prefix ? What was wrong with "pools" in the
> > powerpc imlementation ?
>
> for the same reason you want to re-baptize iommu_table above- at
> the time, I was doing it to minimize conflicts with existing usage.
> But I can rename everything if you like.
But in that case it doesn't make much sense and makes the names longer.
Those are just "pools", it's sufficient.
> > > +#define IOMMU_LARGE_ALLOC 15
> >
> > Make that a variable, here too, the arch might want to tweak it.
> >
> > I think 15 is actually not a good value for powerpc with 4k iommu pages
> > and 64k PAGE_SIZE, we should make the above some kind of factor of
> > PAGE_SHIFT - iommu_page_shift.
>
> Ok.
>
> > > + /* Sanity check */
> > > + if (unlikely(npages = 0)) {
> > > + printk_ratelimited("npages = 0\n");
> >
> > You removed the WARN_ON here which had the nice property of giving you a
> > backtrace which points to the offender. The above message alone is
> > useless.
>
> yes, the original code was generating checkpatch warnings and errors.
> That's why I removed it.
Put it back please, and ignore checkpatch, it's become an annoyance more
than anything else.
Note that nowadays, you can probably use WARN_ON_ONCE(npages = 0); in
place of the whole construct.
> > I am not certain about the "first unlocked pool"... We take the lock for
> > a fairly short amount of time, I have the gut feeling that the cache
> > line bouncing introduced by looking at a different pool may well cost
> > more than waiting a bit longer. Did do some measurements of that
> > optimisation ?
>
> if you are really only taking it for a short amount of time, then
> the trylock should just succeed, so there is no cache line bouncing.
No that's not my point. The lock is only taken for a short time but
might still collide, the bouncing in that case will probably (at least
that's my feeling) hurt more than help.
However, I have another concern with your construct. Essentially you
spin looking for an unlocked pool without a cpu_relax. Now it's unlikely
but you *can* end up eating cycles, which on a high SMT like POWER8
might mean slowing down the actual owner of the pool lock.
> But yes, I did instrument it with iperf, and there was lock contention
> on the spinlock, which was eliminted by the trylock.
What is iperf ? What does that mean "there was lock contention" ? IE,
was the overall performance improved or not ? Removing contention by
trading it for cache line bouncing will not necessarily help. I'm not
saying this is a bad optimisation but it makes the code messy and I
think deserves some solid numbers demonstrating its worth.
> I'll fix the rest of the variable naming etc nits and send out
> a new version later today.
There was also an actual bug in the case where you hop'ed to a new pool
and forgot the flush.
Cheers,
Ben.
next prev parent reply other threads:[~2015-03-30 10:55 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-25 17:34 [PATCH v7 0/3] Generic IOMMU pooled allocator Sowmini Varadhan
2015-03-25 17:34 ` Sowmini Varadhan
2015-03-25 17:34 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock Sowmini Varadhan
2015-03-25 17:34 ` Sowmini Varadhan
2015-03-30 3:24 ` Benjamin Herrenschmidt
2015-03-30 3:24 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and l Benjamin Herrenschmidt
2015-03-30 10:38 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock Sowmini Varadhan
2015-03-30 10:38 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and l Sowmini Varadhan
2015-03-30 10:55 ` Benjamin Herrenschmidt [this message]
2015-03-30 10:55 ` Benjamin Herrenschmidt
2015-03-30 13:01 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock Sowmini Varadhan
2015-03-30 13:01 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and l Sowmini Varadhan
2015-03-30 21:15 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock Sowmini Varadhan
2015-03-30 21:15 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and l Sowmini Varadhan
2015-03-30 21:28 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock Benjamin Herrenschmidt
2015-03-30 21:28 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and l Benjamin Herrenschmidt
2015-03-30 21:38 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock Sowmini Varadhan
2015-03-30 21:38 ` [PATCH v7 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and l Sowmini Varadhan
2015-03-25 17:34 ` [PATCH v7 RFC 2/3] sparc: Make sparc64 use scalable lib/iommu-common.c functions Sowmini Varadhan
2015-03-25 17:34 ` Sowmini Varadhan
2015-03-25 17:34 ` [PATCH v7 RFC 3/3] sparc: Make LDC use common iommu poll management functions Sowmini Varadhan
2015-03-25 17:34 ` Sowmini Varadhan
2015-03-25 18:12 ` [PATCH v7 0/3] Generic IOMMU pooled allocator David Miller
2015-03-25 18:12 ` David Miller
2015-03-25 21:05 ` Benjamin Herrenschmidt
2015-03-25 21:05 ` Benjamin Herrenschmidt
2015-03-27 10:51 ` Sowmini Varadhan
2015-03-27 10:51 ` Sowmini Varadhan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1427712943.20500.77.camel@kernel.crashing.org \
--to=benh@kernel.crashing.org \
--cc=aik@au1.ibm.com \
--cc=anton@au1.ibm.com \
--cc=davem@davemloft.net \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=paulus@samba.org \
--cc=sowmini.varadhan@oracle.com \
--cc=sparclinux@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.