All of lore.kernel.org
 help / color / mirror / Atom feed
From: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
To: Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	nd-5wv7dgnIgG8@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH 3/3] iommu/iova: Try harder to allocate from rcache magazine
Date: Wed, 27 Sep 2017 17:50:31 +0100	[thread overview]
Message-ID: <20170927175031.7fef6fdd@m750.lan> (raw)
In-Reply-To: <20170927140051.GO8398-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>

On Wed, 27 Sep 2017 16:00:51 +0200
Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org> wrote:

> On Tue, Sep 19, 2017 at 02:48:41PM +0100, Robin Murphy wrote:
> > When devices with different DMA masks are using the same domain, or
> > for PCI devices where we usually try a speculative 32-bit
> > allocation first, there is a fair possibility that the top PFN of
> > the rcache stack at any given time may be unsuitable for the lower
> > limit, prompting a fallback to allocating anew from the rbtree.
> > Consequently, we may end up artifically increasing pressure on the
> > 32-bit IOVA space as unused IOVAs accumulate lower down in the
> > rcache stacks, while callers with 32-bit masks also impose
> > unnecessary rbtree overhead.
> > 
> > In such cases, let's try a bit harder to satisfy the allocation
> > locally first - scanning the whole stack should still be relatively
> > inexpensive, and even rotating an entry up from the very bottom
> > probably has less overall impact than going to the rbtree.
> > 
> > Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> > ---
> >  drivers/iommu/iova.c | 19 ++++++++++++++++---
> >  1 file changed, 16 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> > index 8f8b436afd81..a7af8273fa98 100644
> > --- a/drivers/iommu/iova.c
> > +++ b/drivers/iommu/iova.c
> > @@ -826,12 +826,25 @@ static bool iova_magazine_empty(struct
> > iova_magazine *mag) static unsigned long iova_magazine_pop(struct
> > iova_magazine *mag, unsigned long limit_pfn)
> >  {
> > +	int i;
> > +	unsigned long pfn;
> > +
> >  	BUG_ON(iova_magazine_empty(mag));
> >  
> > -	if (mag->pfns[mag->size - 1] > limit_pfn)
> > -		return 0;
> > +	/*
> > +	 * If we can pull a suitable pfn from anywhere in the
> > stack, that's
> > +	 * still probably preferable to falling back to the rbtree.
> > +	 */
> > +	for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
> > +		if (i == 0)
> > +			return 0;
> >  
> > -	return mag->pfns[--mag->size];
> > +	pfn = mag->pfns[i];
> > +	mag->size--;
> > +	for (; i < mag->size; i++)
> > +		mag->pfns[i] = mag->pfns[i + 1];  
> 
> Do we need to preserve the order of the elements on the stack or would
> it also suffice to just copy the top-element to the position we are
> removing?

Ooh, good point - the order is more or less meaningless, and if it *did*
matter then that would imply we couldn't do this anyway. Getting rid of
the second loop makes it even more compelling.

Robin.

WARNING: multiple messages have this Message-ID (diff)
From: Robin Murphy <robin.murphy@arm.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	nd@arm.com
Subject: Re: [PATCH 3/3] iommu/iova: Try harder to allocate from rcache magazine
Date: Wed, 27 Sep 2017 17:50:31 +0100	[thread overview]
Message-ID: <20170927175031.7fef6fdd@m750.lan> (raw)
In-Reply-To: <20170927140051.GO8398@8bytes.org>

On Wed, 27 Sep 2017 16:00:51 +0200
Joerg Roedel <joro@8bytes.org> wrote:

> On Tue, Sep 19, 2017 at 02:48:41PM +0100, Robin Murphy wrote:
> > When devices with different DMA masks are using the same domain, or
> > for PCI devices where we usually try a speculative 32-bit
> > allocation first, there is a fair possibility that the top PFN of
> > the rcache stack at any given time may be unsuitable for the lower
> > limit, prompting a fallback to allocating anew from the rbtree.
> > Consequently, we may end up artifically increasing pressure on the
> > 32-bit IOVA space as unused IOVAs accumulate lower down in the
> > rcache stacks, while callers with 32-bit masks also impose
> > unnecessary rbtree overhead.
> > 
> > In such cases, let's try a bit harder to satisfy the allocation
> > locally first - scanning the whole stack should still be relatively
> > inexpensive, and even rotating an entry up from the very bottom
> > probably has less overall impact than going to the rbtree.
> > 
> > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > ---
> >  drivers/iommu/iova.c | 19 ++++++++++++++++---
> >  1 file changed, 16 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> > index 8f8b436afd81..a7af8273fa98 100644
> > --- a/drivers/iommu/iova.c
> > +++ b/drivers/iommu/iova.c
> > @@ -826,12 +826,25 @@ static bool iova_magazine_empty(struct
> > iova_magazine *mag) static unsigned long iova_magazine_pop(struct
> > iova_magazine *mag, unsigned long limit_pfn)
> >  {
> > +	int i;
> > +	unsigned long pfn;
> > +
> >  	BUG_ON(iova_magazine_empty(mag));
> >  
> > -	if (mag->pfns[mag->size - 1] > limit_pfn)
> > -		return 0;
> > +	/*
> > +	 * If we can pull a suitable pfn from anywhere in the
> > stack, that's
> > +	 * still probably preferable to falling back to the rbtree.
> > +	 */
> > +	for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
> > +		if (i == 0)
> > +			return 0;
> >  
> > -	return mag->pfns[--mag->size];
> > +	pfn = mag->pfns[i];
> > +	mag->size--;
> > +	for (; i < mag->size; i++)
> > +		mag->pfns[i] = mag->pfns[i + 1];  
> 
> Do we need to preserve the order of the elements on the stack or would
> it also suffice to just copy the top-element to the position we are
> removing?

Ooh, good point - the order is more or less meaningless, and if it *did*
matter then that would imply we couldn't do this anyway. Getting rid of
the second loop makes it even more compelling.

Robin.

  parent reply	other threads:[~2017-09-27 16:50 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-19 13:48 [PATCH 0/3] Misc IOVA tweaks Robin Murphy
     [not found] ` <cover.1505827369.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2017-09-19 13:48   ` [PATCH 1/3] iommu/iova: Simplify domain destruction Robin Murphy
2017-09-19 13:48     ` Robin Murphy
2017-09-19 13:48 ` [PATCH 2/3] iommu/iova: Make rcache limit_pfn handling more robust Robin Murphy
2017-09-19 13:48 ` [PATCH 3/3] iommu/iova: Try harder to allocate from rcache magazine Robin Murphy
     [not found]   ` <8127fabc219811d8169189e9d7177d42bc74bcbf.1505827369.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2017-09-27 14:00     ` Joerg Roedel
2017-09-27 14:00       ` Joerg Roedel
     [not found]       ` <20170927140051.GO8398-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2017-09-27 16:50         ` Robin Murphy [this message]
2017-09-27 16:50           ` Robin Murphy
2017-09-28 10:31   ` [PATCH v2] " Robin Murphy
2017-09-28 13:41     ` Joerg Roedel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170927175031.7fef6fdd@m750.lan \
    --to=robin.murphy-5wv7dgnigg8@public.gmane.org \
    --cc=iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=nd-5wv7dgnIgG8@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.