From: "Liu, Hong" <hong.liu@intel.com>
To: "Gordon, David S" <david.s.gordon@intel.com>,
"chris@chris-wilson.co.uk" <chris@chris-wilson.co.uk>
Cc: "intel-gfx@lists.freedesktop.org" <intel-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/i915: tidy up request alloc
Date: Mon, 4 Jul 2016 04:08:34 +0000
Message-ID: <1467605321.2003.9.camel@intel.com>
In-Reply-To: <20160701183451.GB27799@nuc-i3427.alporthouse.com>
On Fri, 2016-07-01 at 19:34 +0100, Chris Wilson wrote:
> On Fri, Jul 01, 2016 at 05:58:18PM +0100, Dave Gordon wrote:
> > On 30/06/16 13:49, Tvrtko Ursulin wrote:
> > >
> > > On 30/06/16 11:22, Chris Wilson wrote:
> > > > On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote:
> > > > >
> > > > > On 30/06/16 02:35, Hong Liu wrote:
> > > > > > Return the allocated request pointer directly to remove
> > > > > > the double pointer parameter.
> > > > > >
> > > > > > Signed-off-by: Hong Liu <hong.liu@intel.com>
> > > > > > ---
> > > > > > > drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------
> > > > > > 1 file changed, 7 insertions(+), 18 deletions(-)
> > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > > > index 1d98782..9881455 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > > > @@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct
> > > > > > kref
> > > > > > *req_ref)
> > > > > > kmem_cache_free(req->i915->requests, req);
> > > > > > }
> > > > > >
> > > > > > -static inline int
> > > > > > +static inline struct drm_i915_gem_request *
> > > > > > __i915_gem_request_alloc(struct intel_engine_cs *engine,
> > > > > > - struct i915_gem_context *ctx,
> > > > > > - struct drm_i915_gem_request **req_out)
> > > > > > + struct i915_gem_context *ctx)
> > > > > > {
> > > > > > struct drm_i915_private *dev_priv = engine->i915;
> > > > > > unsigned reset_counter =
> > > > > > i915_reset_counter(&dev_priv->gpu_error);
> > > > > > struct drm_i915_gem_request *req;
> > > > > > int ret;
> > > > > >
> > > > > > - if (!req_out)
> > > > > > - return -EINVAL;
> > > > > > -
> > > > > > - *req_out = NULL;
> > > > > > -
> > > > > > > /* ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> > > > > > > * EIO if the GPU is already wedged, or EAGAIN to drop the struct_mutex
> > > > > > > * and restart.
> > > > > > > */
> > > > > > ret = i915_gem_check_wedge(reset_counter,
> > > > > > dev_priv->mm.interruptible);
> > > > > > if (ret)
> > > > > > - return ret;
> > > > > > + return ERR_PTR(ret);
> > > > > >
> > > > > > req = kmem_cache_zalloc(dev_priv->requests,
> > > > > > GFP_KERNEL);
> > > > > > if (req == NULL)
> > > > > > - return -ENOMEM;
> > > > > > + return ERR_PTR(-ENOMEM);
> > > > > >
> > > > > > ret = i915_gem_get_seqno(engine->i915, &req->seqno);
> > > > > > if (ret)
> > > > > > @@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct
> > > > > > intel_engine_cs *engine,
> > > > > > if (ret)
> > > > > > goto err_ctx;
> > > > > >
> > > > > > - *req_out = req;
> > > > > > - return 0;
> > > > > > + return req;
> > > > > >
> > > > > > err_ctx:
> > > > > > i915_gem_context_unreference(ctx);
> > > > > > err:
> > > > > > kmem_cache_free(dev_priv->requests, req);
> > > > > > - return ret;
> > > > > > + return ERR_PTR(ret);
> > > > > > }
> > > > > >
> > > > > > /**
> > > > > > @@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
> > > > > > i915_gem_request_alloc(struct intel_engine_cs *engine,
> > > > > > struct i915_gem_context *ctx)
> > > > > > {
> > > > > > - struct drm_i915_gem_request *req;
> > > > > > - int err;
> > > > > > -
> > > > > > if (ctx == NULL)
> > > > > > ctx = engine->i915->kernel_context;
> > > > > > - err = __i915_gem_request_alloc(engine, ctx, &req);
> > > > > > - return err ? ERR_PTR(err) : req;
> > > > > > + return __i915_gem_request_alloc(engine, ctx);
> > > > > > }
> > > > > >
> > > > > > struct drm_i915_gem_request *
> > > > > >
> > > > >
> > > > > Looks good to me. And have this feeling I've seen this
> > > > > somewhere before.
> > > >
> > > > > Several times. This is not the full tidy, nor does it realise
> > > > > the ramifications of request alloc through the stack.
> > >
> > > > Hm I can't spot that it is doing anything wrong or making
> > > > anything worse. You don't want to let the small cleanup in?
> > >
> > > Regards,
> > > Tvrtko
> >
> > It ought to make almost no difference, because the *only* place the
> > inner function is called is from the outer one, which passes a
> > pointer to a local for the returned object; and the inner one is
> > then inlined, so the compiler doesn't actually put it on the stack
> > and call to the inner allocator anyway.
> >
> > Strangely, however, with this change the code becomes ~400 bytes
> > bigger!
> >
> > Disassembly reveals that while the code for the externally-callable
> > outer function is indeed almost identical, a second copy of it has
> > also been inlined at the one callsite in this file:
> >
> > __i915_gem_object_sync() ...
> > req = i915_gem_request_alloc(to, NULL);
> >
> > > I don't think that's a critical path and would rather have 400
> > > bytes smaller codespace. We can get that back by adding /noinline/
> > > to the outer function i915_gem_request_alloc() (not, of course, to
> > > the inner one, that definitely *should* be inline).
>
> __i915_gem_object_sync() should not be calling
> i915_gem_request_alloc().
>
> That's the issue with this patch, your patch and John's patch.
So was i915_gem_request_alloc() written this way to avoid being
inlined into callers like __i915_gem_object_sync()?
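
To make the two calling conventions under discussion concrete, here is a
minimal userspace sketch, not the driver code. It assumes stand-in
definitions of ERR_PTR(), PTR_ERR() and IS_ERR() that mimic the kernel
helpers, plus a hypothetical struct request and hypothetical allocator
names (request_alloc_out, request_alloc_ptr, check_both_conventions).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins (assumption) for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical request type, not struct drm_i915_gem_request. */
struct request {
	int seqno;
};

/* Old shape: status code returned, object handed back via out-parameter. */
static int request_alloc_out(int fail, struct request **req_out)
{
	struct request *req;

	if (!req_out)
		return -EINVAL;
	*req_out = NULL;

	if (fail)
		return -EIO;

	req = calloc(1, sizeof(*req));
	if (!req)
		return -ENOMEM;

	req->seqno = 1;
	*req_out = req;
	return 0;
}

/* New shape: the pointer itself carries either the object or the errno. */
static struct request *request_alloc_ptr(int fail)
{
	struct request *req;

	if (fail)
		return ERR_PTR(-EIO);

	req = calloc(1, sizeof(*req));
	if (!req)
		return ERR_PTR(-ENOMEM);

	req->seqno = 1;
	return req;
}

/* Run both allocators with the same input and return the common errno. */
static int check_both_conventions(int fail)
{
	struct request *a = NULL;
	int err_out = request_alloc_out(fail, &a);
	struct request *b = request_alloc_ptr(fail);
	int err_ptr = IS_ERR(b) ? (int)PTR_ERR(b) : 0;

	assert(err_out == err_ptr);

	free(a);		/* a is NULL on failure, which free() accepts */
	if (!IS_ERR(b))
		free(b);
	return err_ptr;
}
```

Both shapes report the same errors. The second only drops the NULL check
on the out-parameter and the temporary in the caller, so any size
difference comes from what the compiler decides to inline, not from the
logic itself.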
I checked the file with GCC 4.8.5 in my CentOS environment, and the
result matches what Dave found: with the patch, i915_gem_object_sync()
is 368 bytes bigger.
But when I checked it with GCC 6.1.1 on Fedora 24, the compiler seems to
inline i915_gem_request_alloc() even with the current implementation,
and with the patch i915_gem_object_sync() is 80 bytes smaller.
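
Since the result depends on each compiler's inlining heuristics, Dave's
noinline suggestion is one way to pin the behaviour down. The following
is a rough userspace sketch of that idea under stated assumptions: the
noinline macro is defined locally with the GCC/Clang function attribute,
and struct request, request_alloc_inner(), request_alloc() and
sync_object() are hypothetical stand-ins, not the i915 functions.

```c
#include <errno.h>
#include <stdlib.h>

/* Local stand-in (assumption) for the kernel's noinline annotation. */
#define noinline __attribute__((noinline))

/* Hypothetical request type, not struct drm_i915_gem_request. */
struct request {
	int seqno;
};

/* Inner helper: small, single caller, so inlining it is desirable. */
static inline int request_alloc_inner(int seqno, struct request **out)
{
	struct request *req = calloc(1, sizeof(*req));

	if (!req)
		return -ENOMEM;

	req->seqno = seqno;
	*out = req;
	return 0;
}

/*
 * Outer, externally callable function. noinline asks the compiler to
 * keep a single out-of-line copy instead of duplicating the body at
 * call sites in the same file.
 */
noinline struct request *request_alloc(int seqno)
{
	struct request *req = NULL;

	return request_alloc_inner(seqno, &req) ? NULL : req;
}

/* Same-file caller, standing in for a non-critical path. */
int sync_object(int seqno)
{
	struct request *req = request_alloc(seqno);
	int ret;

	if (!req)
		return -ENOMEM;

	ret = req->seqno;
	free(req);
	return ret;
}
```

Building the object file with and without the annotation and comparing
per-symbol sizes (for example with nm -S --size-sort) should show
whether a given GCC version duplicates the allocator body in the caller.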
> -Chris
>