All of lore.kernel.org
 help / color / mirror / Atom feed
From: Imre Deak <imre.deak@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org,
	Maxim Levitsky <maximlevitsky@gmail.com>,
	Tejun Heo <tj@kernel.org>, Daniel Vetter <daniel.vetter@ffwll.ch>,
	linaro-mm-sig@lists.linaro.org
Subject: Re: [PATCH v2] lib/scatterlist: add simple page iterator
Date: Tue, 12 Feb 2013 19:07:20 +0200	[thread overview]
Message-ID: <1360688840.5578.75.camel@localhost> (raw)
In-Reply-To: <20130211125422.978c1e99.akpm@linux-foundation.org>

On Mon, 2013-02-11 at 12:54 -0800, Andrew Morton wrote:
> On Mon, 11 Feb 2013 20:50:04 +0200
> Imre Deak <imre.deak@intel.com> wrote:
> 
> > Add an iterator to walk through a scatter list a page at a time starting
> > at a specific page offset. As opposed to the mapping iterator this is
> 
> What is "the mapping iterator"?

It's the one implemented by sg_miter_{start,stop} in scatterlist.c. It
also iterates through a scatterlist a page at a time, but it also kmaps
these pages. Since in our use case we don't need to map the pages we
needed a solution without this overhead.

> > meant to be small, performing well even in simple loops like collecting
> > all pages on the scatterlist into an array or setting up an iommu table
> > based on the pages' DMA address.
> 
> Where will this new macro be used?  What is driving this effort?

At the moment the only user of the macro would be the i915 driver, see
[1] for the patches that takes it into use. In the patchset the macro
was added as a DRM specific macro, but since it might be useful in the
future for other drivers too (anything using dma-buf) I'd like to add it
to a more generic place.

> > v2:
> > - In each iteration sg_pgoffset pointed incorrectly at the next page not
> >   the current one.
> > 
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> >  include/linux/scatterlist.h |   50 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 50 insertions(+)
> > 
> > diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> > index 4bd6c06..72578b5 100644
> > --- a/include/linux/scatterlist.h
> > +++ b/include/linux/scatterlist.h
> > @@ -231,6 +231,56 @@ size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents,
> >   */
> >  #define SG_MAX_SINGLE_ALLOC		(PAGE_SIZE / sizeof(struct scatterlist))
> >  
> > +struct sg_page_iter {
> > +	struct scatterlist *sg;
> > +	int sg_pgoffset;
> > +	struct page *page;
> > +};
> 
> Some documentation wouldn't hurt.   What it's used for, why it exists.

Ok, will add it.

> 
> > +static inline int
> > +sg_page_cnt(struct scatterlist *sg)
> 
> unneeded newline here.
> 
> A more typical name would be "sg_page_count".  Stripping words of their
> vowels makes the symbols harder to remember.

Ok, will fix this.

> > +{
> > +	BUG_ON(sg->offset || sg->length & ~PAGE_MASK);
> > +
> > +	return sg->length >> PAGE_SHIFT;
> > +}
> > +
> > +static inline struct page *
> > +sg_page_iter_get_page(struct sg_page_iter *iter)
> > +{
> > +	while (iter->sg && iter->sg_pgoffset >= sg_page_cnt(iter->sg)) {
> > +		iter->sg_pgoffset -= sg_page_cnt(iter->sg);
> > +		iter->sg = sg_next(iter->sg);
> > +	}
> > +
> > +	return iter->sg ? nth_page(sg_page(iter->sg), iter->sg_pgoffset) : NULL;
> > +}
> > +
> > +static inline void
> > +sg_page_iter_next(struct sg_page_iter *iter)
> > +{
> > +	iter->sg_pgoffset++;
> > +	iter->page = sg_page_iter_get_page(iter);
> > +}
> > +
> > +static inline void
> > +sg_page_iter_start(struct sg_page_iter *iter, struct scatterlist *sglist,
> > +		   unsigned long pgoffset)
> > +{
> > +	iter->sg = sglist;
> > +	iter->sg_pgoffset = pgoffset;
> > +	iter->page = sg_page_iter_get_page(iter);
> > +}
> 
> All the above are undocumented also.  I guess that's acceptable if they
> are only ever to be used by for_each_sg_page().  Although if that's the
> case then perhaps the identifiers should be a bit more obscure-looking.
> Usually we prefix them with "__" to say "this is in internal thing".

Yes, they are meant to be used only internally, so I'll add the __
prefix.

> > +/*
> > + * Simple sg page iterator, starting off at the given page offset. Each entry
> > + * on the sglist must start at offset 0 and can contain only full pages.
> > + * iter->page will point to the current page, iter->sg_pgoffset to the page
> > + * offset within the sg holding that page.
> > + */
> > +#define for_each_sg_page(sglist, iter, pgoffset)				\
> > +	for (sg_page_iter_start((iter), (sglist), (pgoffset));		\
> > +	     (iter)->page; sg_page_iter_next(iter))
> 
> Because all the helper functions are inlined, this will expand to a
> quite large amount of code.  And large code can be slow code due to
> I-cache eviction.
> 
> I don't know *how* big this thing will be because the patch didn't
> include a caller and I can't be bothered writing my own.  (And the lack
> of any caller means that the code will not be tested).
> 
> So, exactly how big is this thing, and how do we know it's better this
> way than if we were to uninline some/all of the helpers?

I admit I only hoped compiler optimization would keep the inlined parts
at a minimum, but now I actually checked (on Intel CPU). I applied the
patchset from [1] and uninlined sg_page_iter_start as it's not
significant for speed:

 size drivers/gpu/drm/i915/i915.ko
 514855	  15996	    272	 531123	  81ab3	drivers/gpu/drm/i915/i915.ko

Then uninlined all helpers:
 size drivers/gpu/drm/i915/i915.ko
 513447	  15996	    272	 529715	  81533	drivers/gpu/drm/i915/i915.ko

Since there are 8 invocations of the macro, the overhead for a single
invocation is about (531123 - 529715) / 8 = 191 bytes.

For speed, I benchmarked a simple loop which was basically:

page = vmalloc(sizeof(*page) * 1000, GFP_KERNEL);
for_each_sg_page(sglist, iter, 0)
	*page++ = iter.page;

where each entry on the sglist contained 16 consecutive pages. This
takes ~10% more time for the uninlined version to run. This is a rather
artificial test and I couldn't come up with something more real-life
using only the i915 driver's ioctl interface that would show a
significant change in speed.

So at least for now I'm ok with just uninlining all the helpers.

Thanks for the review,
Imre

[1]
http://lists.freedesktop.org/archives/intel-gfx/2013-February/024589.html


  reply	other threads:[~2013-02-12 17:07 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-22 13:25 [i-g-t PATCH 0/4] flip_test: support for monotonic timestamps Imre Deak
2012-11-22 13:25 ` [i-g-t PATCH 1/4] flip_test: add wf-vblank test for expired sequence Imre Deak
2012-11-22 13:25 ` [i-g-t PATCH 2/4] flip_test: skip check for last_received_ts for the first event Imre Deak
2012-11-22 13:25 ` [i-g-t PATCH 3/4] flip_test: use monotonic time to measure the test duration Imre Deak
2012-11-22 13:52   ` Daniel Vetter
2012-11-22 13:25 ` [i-g-t PATCH 4/4] flip_test: switch to using monotonic timestamps Imre Deak
2012-11-22 13:46   ` Daniel Vetter
2012-11-22 13:50     ` Imre Deak
2012-11-22 14:04       ` Daniel Vetter
2012-11-22 14:07         ` Imre Deak
2012-11-22 14:46   ` [i-g-t PATCH 4/4] flip_test: switch to using monotonic timestamps (v2) Imre Deak
2012-11-22 20:20     ` Daniel Vetter
2013-02-11 16:32   ` [PATCH] lib/scatterlist: add simple page iterator Imre Deak
2013-02-11 18:50   ` [PATCH v2] " Imre Deak
2013-02-11 20:54     ` Andrew Morton
2013-02-12 17:07       ` Imre Deak [this message]
2013-02-12 17:13         ` Tejun Heo
2013-02-13 14:51           ` Imre Deak
2013-02-12 21:28         ` Andrew Morton
2013-02-13 15:10     ` [PATCH v3 1/2] " Imre Deak
2013-02-13 15:10     ` [PATCH v3 2/2] lib/scatterlist: use page iterator in the mapping iterator Imre Deak
2013-02-21 13:58       ` [PATCH v4] " Imre Deak
2013-02-24 11:05         ` [PATCH v5] " Imre Deak
2013-02-27  2:30           ` Stephen Warren
2013-02-23  4:29       ` [PATCH v3 2/2] " Stephen Warren
2013-02-23 20:04         ` Imre Deak
2013-02-23 23:45         ` Stephen Warren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1360688840.5578.75.camel@localhost \
    --to=imre.deak@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=daniel.vetter@ffwll.ch \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maximlevitsky@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.