From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	intel-gfx@lists.freedesktop.org,
	Daniel Vetter <daniel.vetter@ffwll.ch>
Subject: Re: [PATCH v2] drm/i915: Store a direct lookup from object handle to vma
Date: Wed, 22 Mar 2017 16:22:38 +0200
Message-ID: <1490192558.2802.63.camel@linux.intel.com>
In-Reply-To: <20170322093347.2593-1-chris@chris-wilson.co.uk>

+ Daniel for the rsvd2

On Wed, 2017-03-22 at 09:33 +0000, Chris Wilson wrote:
> The advent of full-ppgtt led to an extra indirection between the object
> and its binding. That extra indirection has a noticeable impact on how
> fast we can convert from the user handles to our internal vma for
> execbuffer. In order to bypass the extra indirection, we use a
> resizable hashtable to jump from the object to the per-ctx vma.
> rhashtable was considered but we don't need the online resizing feature
> and the extra complexity proved to undermine its usefulness. Instead, we
> simply reallocate the hashtable on demand in a background task and
> serialize it before iterating.
> 
> In non-full-ppgtt modes, multiple files and multiple contexts can share
> the same vma. This leads to having multiple possible handle->vma links,
> so we only use the first to establish the fast path. The majority of
> buffers are not shared and so we should still be able to realise
> speedups with multiple clients.
> 
> v2: Prettier names, more magic.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

<SNIP>
 
> +static void resize_vma_ht(struct work_struct *work)
> +{
> +	struct i915_gem_context_vma_lut *lut =
> +		container_of(work, typeof(*lut), resize);
> +	unsigned int size, bits, new_bits, i;
> +	struct hlist_head *new_ht;
> +
> +	bits = 1 + ilog2(4*lut->ht_count/3);
> +	new_bits = min_t(unsigned int,
> +			 max(bits, VMA_HT_BITS),
> +			 sizeof(unsigned int)*8);

Use sizeof(unsigned int) * BITS_PER_BYTE instead of the bare *8, for
extra clarity.
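
To spell out the sizing being discussed, here is a rough userspace
sketch of the same arithmetic. CHAR_BIT stands in for the kernel's
BITS_PER_BYTE, ilog2_u() is a hypothetical stand-in for ilog2(), the
value of VMA_HT_BITS is an assumption (the quoted hunk does not show
it), and treating an empty table as 0 bits before clamping is my own
choice:

```c
#include <limits.h>   /* CHAR_BIT: userspace analogue of BITS_PER_BYTE */

/* Assumed minimum table order; the real value is not in the quoted hunk. */
#define VMA_HT_BITS 2u

/* Floor of log2(v). The caller must pass a non-zero value. */
static unsigned int ilog2_u(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Pick a table order giving roughly 4/3 buckets per entry, then clamp. */
static unsigned int vma_ht_new_bits(unsigned int ht_count)
{
	unsigned int scaled = 4 * ht_count / 3;
	unsigned int bits = scaled ? 1 + ilog2_u(scaled) : 0;
	unsigned int max_bits = sizeof(unsigned int) * CHAR_BIT;

	if (bits < VMA_HT_BITS)
		bits = VMA_HT_BITS;
	if (bits > max_bits)
		bits = max_bits;
	return bits;
}
```

With these assumptions, 100 entries scale to 133, which needs an order
of 8 (256 buckets), and anything tiny is clamped up to VMA_HT_BITS.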

> +	if (new_bits == lut->ht_bits)
> +		goto out;
> +
> +	new_ht = kzalloc(sizeof(*new_ht)<<new_bits, GFP_KERNEL | __GFP_NOWARN);
> +	if (!new_ht)
> +		new_ht = vzalloc(sizeof(*new_ht)<<new_bits);

No vcalloc :( Otherwise would've suggested

vcalloc(BIT(new_bits), sizeof(*new_ht), ...);

but

kzalloc(BIT(new_bits)*sizeof(*new_ht), ...)

might still be clearer.
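
For illustration only, a userspace sketch of the "count times element
size" form. calloc() plays the role of a calloc-style kernel helper,
and struct bucket and alloc_buckets() are made-up names standing in for
struct hlist_head and the resize path:

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct hlist_head: a single "first" pointer per bucket. */
struct bucket {
	void *first;
};

/*
 * Allocate a zeroed table of (1 << bits) buckets. Passing the count and
 * the element size separately keeps the intent visible and lets calloc()
 * reject a product that would overflow size_t.
 */
static struct bucket *alloc_buckets(unsigned int bits)
{
	size_t count;

	/* Shifting by the full width of size_t would be undefined. */
	if (bits >= sizeof(size_t) * CHAR_BIT)
		return NULL;

	count = (size_t)1 << bits;
	return calloc(count, sizeof(struct bucket));
}
```

The caller frees the table with free(); a NULL return covers both an
oversized order and an ordinary allocation failure.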

> @@ -266,6 +331,16 @@ __create_hw_context(struct drm_i915_private *dev_priv,
>  	list_add_tail(&ctx->link, &dev_priv->context_list);
>  	ctx->i915 = dev_priv;
>  
> +	ctx->vma_lut.ht_bits = VMA_HT_BITS;
> +	ctx->vma_lut.ht_size = BIT(VMA_HT_BITS);
> +	ctx->vma_lut.ht = kcalloc(ctx->vma_lut.ht_size,
> +				  sizeof(*ctx->vma_lut.ht),
> +				  GFP_KERNEL);
> +	if (!ctx->vma_lut.ht)
> +		goto err_out;
> +

Errors after this point will leak the lut. This needs an err_lut label,
and the later error paths should goto it instead of err_out.
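
Roughly the shape I have in mind, as a userspace sketch. struct ctx,
ctx_create(), the "name" allocation and the fail_late switch are all
invented for the example; only the err_lut label comes from the comment
above:

```c
#include <stdlib.h>

struct ctx {
	void **ht;   /* stand-in for vma_lut.ht */
	char *name;  /* hypothetical later allocation that may fail */
};

/* fail_late != 0 simulates an error after the table has been allocated. */
static struct ctx *ctx_create(int fail_late)
{
	struct ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		goto err_out;

	ctx->ht = calloc(16, sizeof(*ctx->ht));
	if (!ctx->ht)
		goto err_ctx;

	ctx->name = fail_late ? NULL : malloc(8);
	if (!ctx->name)
		goto err_lut;  /* anything failing from here on must free ht */

	return ctx;

err_lut:
	free(ctx->ht);
err_ctx:
	free(ctx);
err_out:
	return NULL;
}

static void ctx_destroy(struct ctx *ctx)
{
	if (!ctx)
		return;
	free(ctx->name);
	free(ctx->ht);
	free(ctx);
}
```

The labels unwind in reverse order of setup, so each later failure only
has to pick the right entry point.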

> @@ -143,6 +143,31 @@ struct i915_gem_context {
>  	/** ggtt_offset_bias: placement restriction for context objects */
>  	u32 ggtt_offset_bias;
>  
> +	struct i915_gem_context_vma_lut {
> +		/** ht_size: last request size to allocate the hashtable for. */
> +		unsigned int ht_size;
> +#define RESIZE_IN_PROGRESS BIT(0)

This name could easily conflict. Forward declare the struct and hide
its definition in the .c file? That would mean putting ht[0] at the
end, and maybe pulling out the work. Or maybe just prefix the name :P

> +#define to_ptr(T, x) ((T *)(uintptr_t)(x))

Put this in i915_utils.h so we can some day push it to core.
from_uintptr might make more sense as a name, though.
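
A small userspace sketch of what the macro does: round-tripping a
pointer through a 64-bit integer field. struct vma, stash_vma() and
fetch_vma() are hypothetical names; only to_ptr() is taken from the
patch:

```c
#include <stdint.h>

/* Same definition as in the quoted hunk. */
#define to_ptr(T, x) ((T *)(uintptr_t)(x))

struct vma {
	int id;
};

/* Store a pointer in an integer-typed slot, going via uintptr_t. */
static uint64_t stash_vma(struct vma *vma)
{
	return (uint64_t)(uintptr_t)vma;
}

/* Recover the typed pointer from the integer slot. */
static struct vma *fetch_vma(uint64_t field)
{
	return to_ptr(struct vma, field);
}
```

The intermediate uintptr_t cast is what keeps this warning-free when
the pointer is narrower than the 64-bit field.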

The remainder is somewhat hard to review because the code motion and
the functional changes are combined into one mess. I didn't find
anything else, but I still dislike the rsvd2 usage and the magic bit
it carries.

We could at least #define proper_name rsvd2 in the .c file... Jani
would be glad, I guess. I think if somebody touches a reserved field by
name, they deserve to have their build broken.
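
Something along these lines, as a sketch. struct exec_object is a
trimmed, made-up stand-in for the uapi struct, and exec_vma_cookie is
just a placeholder for whatever proper name we settle on:

```c
#include <stdint.h>

/* Trimmed, hypothetical stand-in for the uapi exec object. */
struct exec_object {
	uint64_t handle;
	uint64_t rsvd1;
	uint64_t rsvd2;
};

/* Local to the .c file: a descriptive alias for the reserved field. */
#define exec_vma_cookie rsvd2

static void set_cookie(struct exec_object *obj, uint64_t value)
{
	obj->exec_vma_cookie = value;  /* expands to obj->rsvd2 */
}

static uint64_t get_cookie(const struct exec_object *obj)
{
	return obj->exec_vma_cookie;
}
```

The code then reads in terms of what the field is used for, and the
alias stays invisible outside the one file that defines it.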

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation