All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ben Widawsky <benjamin.widawsky@intel.com>
To: Matt Turner <mattst88@gmail.com>
Cc: Intel GFX <intel-gfx@lists.freedesktop.org>,
	Ben Widawsky <ben@bwidawsk.net>,
	DRI Development <dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH 2/4] drm/cache: Try to be smarter about clflushing on x86
Date: Mon, 15 Dec 2014 14:07:02 -0800	[thread overview]
Message-ID: <20141215220702.GB24678@intel.com> (raw)
In-Reply-To: <CAEdQ38FD7Jqs_tcFPXQ0tQ82kfGiwr+NWNqms3ze1h1OBwJ5Tw@mail.gmail.com>

On Sat, Dec 13, 2014 at 08:15:22PM -0800, Matt Turner wrote:
> On Sat, Dec 13, 2014 at 7:08 PM, Ben Widawsky
> <benjamin.widawsky@intel.com> wrote:
> > Any GEM driver which has very large objects and a slow CPU is subject to very
> > long waits simply for clflushing incoherent objects. Generally, each individual
> > object is not a problem, but if you have very large objects, or very many
> > objects, the flushing begins to show up in profiles. Because on x86 we know the
> > cache size, we can easily determine when an object will use all the cache, and
> > forego iterating over each cacheline.
> >
> > We need to be careful when using wbinvd. wbinvd() is itself potentially slow
> > because it requires synchronizing the flush across all CPUs so they have a
> > coherent view of memory. This can result in either stalling work being done on
> > other CPUs, or this call itself stalling while waiting for a CPU to accept the
> > interrupt. Also, wbinvd() also has the downside of invalidating all cachelines,
> > so we don't want to use it unless we're sure we already own most of the
> > cachelines.
> >
> > The current algorithm is very naive. I think it can be tweaked more, and it
> > would be good if someone else gave it some thought. I am pretty confident in
> > i915, we can even skip the IPI in the execbuf path with minimal code change (or
> > perhaps just some verifying of the existing code). It would be nice to hear what
> > other developers who depend on this code think.
> >
> > Cc: Intel GFX <intel-gfx@lists.freedesktop.org>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/drm_cache.c | 20 +++++++++++++++++---
> >  1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
> > index d7797e8..6009c2d 100644
> > --- a/drivers/gpu/drm/drm_cache.c
> > +++ b/drivers/gpu/drm/drm_cache.c
> > @@ -64,6 +64,20 @@ static void drm_cache_flush_clflush(struct page *pages[],
> >                 drm_clflush_page(*pages++);
> >         mb();
> >  }
> > +
> > +static bool
> > +drm_cache_should_clflush(unsigned long num_pages)
> > +{
> > +       const int cache_size = boot_cpu_data.x86_cache_size;
> > +
> > +       /* For now the algorithm simply checks if the number of pages to be
> > +        * flushed is greater than the entire system cache. One could make the
> > +        * function more aware of the actual system (ie. if SMP, how large is
> > +        * the cache, CPU freq. etc. All those help to determine when to
> > +        * wbinvd() */
> > +       WARN_ON_ONCE(!cache_size);
> > +       return !cache_size || num_pages < (cache_size >> 2);
> > +}
> >  #endif
> >
> >  void
> > @@ -71,7 +85,7 @@ drm_clflush_pages(struct page *pages[], unsigned long num_pages)
> >  {
> >
> >  #if defined(CONFIG_X86)
> > -       if (cpu_has_clflush) {
> > +       if (cpu_has_clflush && drm_cache_should_clflush(num_pages)) {
> >                 drm_cache_flush_clflush(pages, num_pages);
> >                 return;
> >         }
> > @@ -104,7 +118,7 @@ void
> >  drm_clflush_sg(struct sg_table *st)
> >  {
> >  #if defined(CONFIG_X86)
> > -       if (cpu_has_clflush) {
> > +       if (cpu_has_clflush && drm_cache_should_clflush(st->nents)) {
> >                 struct sg_page_iter sg_iter;
> >
> >                 mb();
> > @@ -128,7 +142,7 @@ void
> >  drm_clflush_virt_range(void *addr, unsigned long length)
> >  {
> >  #if defined(CONFIG_X86)
> > -       if (cpu_has_clflush) {
> > +       if (cpu_has_clflush && drm_cache_should_clflush(length / PAGE_SIZE)) {
> 
> If length isn't a multiple of page size, isn't this ignoring the
> remainder? Should it be rounding length up to the next multiple of
> PAGE_SIZE, like ROUND_UP_TO?

Yeah, we could round_up. In practice it probably won't matter. I actually think
it would be better to pass a size to drm_cache_should_clflush(), and let that
round it up. It sounds like people don't want this patch anyway, so I'll make
the equivalent change in the i915 only patch.

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2014-12-15 22:07 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-14  3:08 [PATCH 0/4] [RFC] Sometimes opt for wbinvd over clflush Ben Widawsky
2014-12-14  3:08 ` [PATCH 1/4] drm/cache: Use wbinvd helpers Ben Widawsky
2014-12-14 12:43   ` Chris Wilson
2014-12-15  7:49     ` Daniel Vetter
2014-12-15 20:26   ` [PATCH] [v2] " Ben Widawsky
2014-12-16  2:26     ` shuang.he
2014-12-16  7:56     ` Daniel Vetter
2014-12-14  3:08 ` [PATCH 2/4] drm/cache: Try to be smarter about clflushing on x86 Ben Widawsky
2014-12-14  4:15   ` Matt Turner
2014-12-15 22:07     ` Ben Widawsky [this message]
2014-12-14 12:59   ` [Intel-gfx] " Chris Wilson
2014-12-15  4:06     ` Jesse Barnes
2014-12-15 19:54       ` Ben Widawsky
2014-12-14  3:08 ` [PATCH 3/4] drm/cache: Return what type of cache flush occurred Ben Widawsky
2014-12-14  3:08 ` [PATCH 4/4] drm/i915: Opportunistically reduce flushing at execbuf Ben Widawsky
2014-12-14  9:35   ` shuang.he
2014-12-14 13:12   ` [Intel-gfx] " Ville Syrjälä
2014-12-14 23:37     ` Ben Widawsky
2014-12-15  7:55       ` [Intel-gfx] " Daniel Vetter
2014-12-15  8:20         ` Chris Wilson
2014-12-15 19:56           ` Ben Widawsky
2014-12-15 20:39             ` Chris Wilson
2014-12-15 21:06               ` [Intel-gfx] " Ben Widawsky
2014-12-16  7:57                 ` Chris Wilson
2014-12-16 19:38                   ` Ben Widawsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141215220702.GB24678@intel.com \
    --to=benjamin.widawsky@intel.com \
    --cc=ben@bwidawsk.net \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=mattst88@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.