Re: [PATCH 05/24] drm/i915: Replace the array of pages with a scatterlist

public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed

From: Chris Wilson <chris@chris-wilson.co.uk>
To: Ben Widawsky <ben@bwidawsk.net>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 05/24] drm/i915: Replace the array of pages with a scatterlist
Date: Mon, 10 Sep 2012 17:34:48 +0100	[thread overview]
Message-ID: <275ffc$6hu1f3@fmsmga002.fm.intel.com> (raw)
In-Reply-To: <20120906184924.43b2a21a@bwidawsk.net>

On Thu, 6 Sep 2012 18:49:24 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Tue,  4 Sep 2012 21:02:57 +0100
> Chris Wilson <chris@chris-wilson.co.uk> wrote:
> 
> > Rather than have multiple data structures for describing our page layout
> > in conjunction with the array of pages, we can migrate all users over to
> > a scatterlist.
> > 
> > One major advantage, other than unifying the page tracking structures,
> > this offers is that we replace the vmalloc'ed array (which can be up to
> > a megabyte in size) with a chain of individual pages which helps reduce
> > memory pressure.
> > 
> > The disadvantage is that we then do not have a simple array to iterate,
> > or to access randomly. The common case for this is in the relocation
> > processing, which will typically fit within a single scatterlist page
> > and so be almost the same cost as the simple array. For iterating over
> > the array, the extra function call could be optimised away, but in
> > reality is an insignificant cost of either binding the pages, or
> > performing the pwrite/pread.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> 
> Now that my eyes are done bleeding, easy ones:
> 
> ERROR: space required after that ',' (ctx:VxV)
> #69: FILE: drivers/char/agp/intel-gtt.c:99:
> +	for_each_sg(st->sgl, sg, num_entries,i)
>  	                                    ^
> 
> WARNING: Prefer pr_err(... to printk(KERN_ERR, ...
> #189: FILE: drivers/gpu/drm/drm_cache.c:117:
> +		printk(KERN_ERR "Timed out waiting for cache
> flush.\n");
> 
> WARNING: Prefer pr_err(... to printk(KERN_ERR, ...
> #191: FILE: drivers/gpu/drm/drm_cache.c:119:
> +	printk(KERN_ERR "Architecture has no drm_cache.c support\n");

Hmm, the drm_cache one is tricky as it is a continuation of the style of
the file and so is probably best kept and then the whole file fixed to
follow the new conventions.

> In addition to the inline comments, it would have been even slightly
> easier to review without the s/page/i since it seems to just be for no
> compelling reason anyway.

It was motivated by using the common idiom for for_each_sg() and by
allowing 'struct page *page' as being the natural local variable within
the loop. So I think the end result justifies the small amount of extra
churn in the patch.

> >  	if (intel_private.base.needs_dmar) {
> > -		ret = intel_gtt_map_memory(mem->pages, mem->page_count,
> > -					   &mem->sg_list, &mem->num_sg);
> > +		struct sg_table st;
> > +
> > +		ret = intel_gtt_map_memory(mem->pages, mem->page_count, &st);
> >  		if (ret != 0)
> >  			return ret;
> >  
> > -		intel_gtt_insert_sg_entries(mem->sg_list, mem->num_sg,
> > -					    pg_start, type);
> > +		intel_gtt_insert_sg_entries(&st, pg_start, type);
> > +		mem->sg_list = st.sgl;
> > +		mem->num_sg = st.nents;
> 
> Can you explain how the corresponding free for the sg_table gets called
> here?

The sg_table is just a small placeholder that is reconstructed in
intel_gtt_unmap_memory() for sg_free_table().

> > @@ -1749,20 +1771,27 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> >  	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> >  	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
> >  
> > -	/* Get the list of pages out of our struct file.  They'll be pinned
> > -	 * at this point until we release them.
> > -	 */
> > +	st = kmalloc(sizeof(*st), GFP_KERNEL);
> > +	if (st == NULL)
> > +		return -ENOMEM;
> > +
> >  	page_count = obj->base.size / PAGE_SIZE;
> > -	obj->pages = drm_malloc_ab(page_count, sizeof(struct page *));
> > -	if (obj->pages == NULL)
> > +	if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
> > +		sg_free_table(st);
> > +		kfree(st);
> >  		return -ENOMEM;
> > +	}
> 
> I think the call here to sg_free_table is bogus.

Experience says otherwise ;-)

The reason is that the sg_alloc_table chains together its individual
page allocations but doesn't perform any unwind if one fails before
reporting the error. sg_free_table() does the right thing in those
circumstances.

> > -	/* link the pages into an SG then map the sg */
> > -	sg = drm_prime_pages_to_sg(obj->pages, npages);
> > -	nents = dma_map_sg(attachment->dev, sg->sgl, sg->nents, dir);
> >  	i915_gem_object_pin_pages(obj);
> 
> <bikeshed>
> I think the right way to go about this is to add rm_prime_pages_to_st
> since you're pushing the whole st>sg thing, other drivers can leverage
> it.
> </bikeshed>

Quite possibly true, but the code will change later and lose some of its
generality. Or at least no else is like i915 yet.
 
> The lifetime description we discussed on IRC would have helped here as
> well.

> >  static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
> > @@ -80,7 +104,9 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
> >  {
> >  	struct drm_i915_gem_object *obj = dma_buf->priv;
> >  	struct drm_device *dev = obj->base.dev;
> > -	int ret;
> > +	struct scatterlist *sg;
> > +	struct page **pages;
> > +	int ret, i;
> >  
> >  	ret = i915_mutex_lock_interruptible(dev);
> >  	if (ret)
> > @@ -92,22 +118,33 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
> >  	}
> >  
> >  	ret = i915_gem_object_get_pages(obj);
> > -	if (ret) {
> > -		mutex_unlock(&dev->struct_mutex);
> > -		return ERR_PTR(ret);
> > -	}
> > +	if (ret)
> > +		goto error;
> >  
> > -	obj->dma_buf_vmapping = vmap(obj->pages, obj->base.size / PAGE_SIZE, 0, PAGE_KERNEL);
> > -	if (!obj->dma_buf_vmapping) {
> > -		DRM_ERROR("failed to vmap object\n");
> > -		goto out_unlock;
> > -	}
> > +	ret = -ENOMEM;
> > +
> > +	pages = drm_malloc_ab(obj->pages->nents, sizeof(struct page *));
> > +	if (pages == NULL)
> > +		goto error;
> > +
> > +	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i)
> > +		pages[i] = sg_page(sg);
> > +
> > +	obj->dma_buf_vmapping = vmap(pages, obj->pages->nents, 0, PAGE_KERNEL);
> > +	drm_free_large(pages);
> > +
> > +	if (!obj->dma_buf_vmapping)
> > +		goto error;
> >  
> >  	obj->vmapping_count = 1;
> >  	i915_gem_object_pin_pages(obj);
> >  out_unlock:
> >  	mutex_unlock(&dev->struct_mutex);
> >  	return obj->dma_buf_vmapping;
> > +
> > +error:
> > +	mutex_unlock(&dev->struct_mutex);
> > +	return ERR_PTR(ret);
> >  }
> >  
> >  static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
> 
> The return on vmap failing looks incorrect to me here. Also, I think
> leaving the DRM_ERROR would have been nice.

Since we already return the ERR_PTR(-ENOMEM) we are not breaking any
semantics by reporting the oom for vmap as well. And yes it would be
nice if vmap gave a specific error as well. So other than the change to
an explicit errno, I'm not sure what mistake you are point out.

In this case the DRM_ERROR has an obvious errno returned to userspace,
much more informative.
> 
> > @@ -270,26 +233,10 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> >  		BUG();
> >  	}
> >  
> > -	if (obj->sg_table) {
> > -		i915_ppgtt_insert_sg_entries(ppgtt,
> > -					     obj->sg_table->sgl,
> > -					     obj->sg_table->nents,
> > -					     obj->gtt_space->start >> PAGE_SHIFT,
> > -					     pte_flags);
> > -	} else if (dev_priv->mm.gtt->needs_dmar) {
> > -		BUG_ON(!obj->sg_list);
> > -
> > -		i915_ppgtt_insert_sg_entries(ppgtt,
> > -					     obj->sg_list,
> > -					     obj->num_sg,
> > -					     obj->gtt_space->start >> PAGE_SHIFT,
> > -					     pte_flags);
> > -	} else
> > -		i915_ppgtt_insert_pages(ppgtt,
> > -					obj->gtt_space->start >> PAGE_SHIFT,
> > -					obj->base.size >> PAGE_SHIFT,
> > -					obj->pages,
> > -					pte_flags);
> > +	i915_ppgtt_insert_sg_entries(ppgtt,
> > +				     obj->sg_table ?: obj->pages,
> > +				     obj->gtt_space->start >> PAGE_SHIFT,
> > +				     pte_flags);
> >  }
> 
> I got lost here. Is it, if there is a prime sg_table use that, otherwise
> just use the object's sgt? If so, I think has_dma_mapping is more
> readable.
> Also, I wonder if ?: pissed off the clang people?

Right, this is just a step along the path to enlightment. 2 out of the 3
paths now use obj->pages with the dmabuf being the only exception to
still create an obj->sg_table scatterlist. '?:' is widely used by the
kernel, if clang doesn't yet support it, that's their problem. But rest
assured it is removed in a couple of patches after migrating dmabuf over
to the page ops.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

next prev parent reply	other threads:[~2012-09-10 16:35 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-04 20:02 Stolen memory, again Chris Wilson
2012-09-04 20:02 ` [PATCH 01/24] drm/i915: Introduce drm_i915_gem_object_ops Chris Wilson
2012-09-06 22:32   ` Ben Widawsky
2012-10-11 18:28   ` Jesse Barnes
2012-09-04 20:02 ` [PATCH 02/24] drm/i915: Pin backing pages whilst exporting through a dmabuf vmap Chris Wilson
2012-09-06 22:55   ` Ben Widawsky
2012-10-11 18:30   ` Jesse Barnes
2012-09-04 20:02 ` [PATCH 03/24] drm/i915: Pin backing pages for pwrite Chris Wilson
2012-09-07  0:07   ` Ben Widawsky
2012-09-12 13:13     ` Daniel Vetter
2012-09-12 13:20       ` Daniel Vetter
2012-10-11 18:31   ` Jesse Barnes
2012-09-04 20:02 ` [PATCH 04/24] drm/i915: Pin backing pages for pread Chris Wilson
2012-09-07  0:10   ` Ben Widawsky
2012-09-04 20:02 ` [PATCH 05/24] drm/i915: Replace the array of pages with a scatterlist Chris Wilson
2012-09-07  1:49   ` Ben Widawsky
2012-09-10 16:34     ` Chris Wilson [this message]
2012-09-12 13:33       ` Daniel Vetter
2012-09-04 20:02 ` [PATCH 06/24] drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops Chris Wilson
2012-09-14 18:02   ` Ben Widawsky
2012-09-14 18:24     ` Chris Wilson
2012-09-14 21:43   ` Daniel Vetter
2012-09-04 20:02 ` [PATCH 07/24] drm: Introduce drm_mm_create_block() Chris Wilson
2012-09-12 13:43   ` Daniel Vetter
2012-09-04 20:03 ` [PATCH 08/24] drm/i915: Fix detection of stolen base for gen2 Chris Wilson
2012-09-04 20:03 ` [PATCH 09/24] drm/i915: Fix location of stolen memory register for SandyBridge+ Chris Wilson
2012-10-11 18:43   ` Jesse Barnes
2012-10-11 19:06     ` Jesse Barnes
2012-09-04 20:03 ` [PATCH 10/24] drm/i915: Avoid clearing preallocated regions from the GTT Chris Wilson
2012-10-11 18:45   ` Jesse Barnes
2012-09-04 20:03 ` [PATCH 11/24] drm: Introduce an iterator over holes in the drm_mm range manager Chris Wilson
2012-09-12 13:54   ` Daniel Vetter
2012-09-04 20:03 ` [PATCH 12/24] drm/i915: Delay allocation of stolen space for FBC Chris Wilson
2012-10-11 18:49   ` Jesse Barnes
2012-10-11 18:56     ` Chris Wilson
2012-09-04 20:03 ` [PATCH 13/24] drm/i915: Defer allocation of stolen memory for FBC until first use Chris Wilson
2012-09-04 20:03 ` [PATCH 14/24] drm/i915: Allow objects to be created with no backing pages, but stolen space Chris Wilson
2012-09-04 20:03 ` [PATCH 15/24] drm/i915: Differentiate between prime and stolen objects Chris Wilson
2012-10-11 18:50   ` Jesse Barnes
2012-09-04 20:03 ` [PATCH 16/24] drm/i915: Support readback of stolen objects upon error Chris Wilson
2012-10-11 18:51   ` Jesse Barnes
2012-09-04 20:03 ` [PATCH 17/24] drm/i915: Handle stolen objects in pwrite Chris Wilson
2012-09-04 20:03 ` [PATCH 18/24] drm/i915: Handle stolen objects for pread Chris Wilson
2012-09-04 20:03 ` [PATCH 19/24] drm/i915: Introduce i915_gem_object_create_stolen() Chris Wilson
2012-10-11 18:53   ` Jesse Barnes
2012-09-04 20:03 ` [PATCH 20/24] drm/i915: Allocate fbcon from stolen memory Chris Wilson
2012-10-11 18:54   ` Jesse Barnes
2012-09-04 20:03 ` [PATCH 21/24] drm/i915: Allocate ringbuffers " Chris Wilson
2012-10-11 18:54   ` Jesse Barnes
2012-09-04 20:03 ` [PATCH 22/24] drm/i915: Allocate overlay registers " Chris Wilson
2012-10-11 18:55   ` Jesse Barnes
2012-09-04 20:03 ` [PATCH 23/24] drm/i915: Use a slab for object allocation Chris Wilson
2012-10-11 18:55   ` Jesse Barnes
2012-09-04 20:03 ` [PATCH 24/24] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl Chris Wilson
  -- strict thread matches above, loose matches on Subject: below --
2012-08-30 15:30 Next iteration of stolen support Chris Wilson
2012-08-30 15:31 ` [PATCH 05/24] drm/i915: Replace the array of pages with a scatterlist Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='275ffc$6hu1f3@fmsmga002.fm.intel.com' \
    --to=chris@chris-wilson.co.uk \
    --cc=ben@bwidawsk.net \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox