From: "Ville Syrjälä" <ville.syrjala-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
To: Daniel Vetter <daniel-/w4YWyX8dFk@public.gmane.org>
Cc: Gurchetan Singh
	<gurchetansingh-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Daniel Vetter
	<daniel.vetter-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Thierry Reding
	<thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Laurent Pinchart
	<laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw@public.gmane.org>,
	ML dri-devel
	<dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Subject: Re: [PATCH 3/4] drm: add ARM flush implementations
Date: Tue, 30 Jan 2018 15:34:00 +0200
Message-ID: <20180130133400.GY5453@intel.com>
In-Reply-To: <CAKMK7uHoAqYsgqOTT4O=NQFvj51M2tKp+c2Sm8t4t01QN9Jkfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Thu, Jan 18, 2018 at 08:42:55AM +0100, Daniel Vetter wrote:
> On Wed, Jan 17, 2018 at 11:46 PM, Gurchetan Singh
> <gurchetansingh-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
> >> dma api just isn't quite sufficient for implementing fast gpu drivers.
> >
> >
> > Can you elaborate?  IIRC the DMA API has strong synchronization guarantees
> > and that can be problematic for GPU drivers.  However, using the DMA API for
> > flushing doesn't necessarily mean the driver has to use the rest of the DMA
> > API.
> >
> >> but then it'd need to directly flush cpu caches and bypass the dma api.
> >
> >
> > On ARM, cache flushing seems to vary from chipset to chipset.  For example,
> > on ARM32 a typical call-stack for dma_sync_single_for_device looks like:
> >
> > arm_dma_sync_single_for_device
> > __dma_page_cpu_to_dev
> > outer_clean_range
> > outer_cache.clean_range
> >
> > There are multiple clean_range implementations out there (e.g.,
> > aurora_clean_range, l2c210_clean_range, feroceon_l2_clean_range), so that's
> > why the DMA API was used in this case.  On ARM64, things are a little
> > simpler, but the DMA API seems to go directly to assembly (__dma_map_area)
> > after a little indirection.  Why do you think that's inefficient?
> 
> I never said it's inefficient. My only gripe is with adding the
> pointless struct device * argument to flushing functions which really
> don't care (nor should care) about the device. Because what we really
> want to call is outer_clean_range here, and that doesn't need a struct
> device *. Imo that function (or something similar) needs to be
> exported and then used by drm_flush_* functions.
> 
> Also note that arm has both flush and invalidate functions, but on x86
> those are the same implementation (and we don't have a separate
> drm_invalidate_* set of functions). That doesn't look like a very good
> idea.

IMO if someone adds some new functions they should talk about
"writeback" and "invalidate". I think "flush" is a very vague
term that could mean different things to different people.

x86 has clflush which is writeback+invalidate, and more recently
x86 gained the clwb instruction for doing just writeback without
the invalidate. On ARM IIRC you can choose whether to do
writeback or invalidate or both.
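
To make the naming point concrete, here is a toy user-space model of a
single write-back cache line. It is only a sketch: it is not kernel code,
it models no real hardware in detail, and every name in it (toy_line,
toy_writeback, toy_invalidate, ...) is made up for illustration. It just
shows that "writeback", "invalidate" and "writeback+invalidate" are three
different operations with different observable results.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one write-back cache line; all names are hypothetical. */
struct toy_line {
	uint32_t mem;   /* what a non-coherent device would see in memory */
	uint32_t cache; /* what the CPU sees while the line is valid */
	bool valid;
	bool dirty;
};

/* CPU store: lands in the cache only, memory is left stale. */
void toy_cpu_write(struct toy_line *l, uint32_t v)
{
	assert(l);
	l->cache = v;
	l->valid = true;
	l->dirty = true;
}

/* CPU load: hits the cache if valid, otherwise refills from memory. */
uint32_t toy_cpu_read(struct toy_line *l)
{
	assert(l);
	if (!l->valid) {
		l->cache = l->mem;
		l->valid = true;
		l->dirty = false;
	}
	return l->cache;
}

/* Writeback only (clwb-like): push dirty data out, line stays valid here. */
void toy_writeback(struct toy_line *l)
{
	assert(l);
	if (l->valid && l->dirty) {
		l->mem = l->cache;
		l->dirty = false;
	}
}

/* Invalidate only: drop the line; any dirty data is discarded. */
void toy_invalidate(struct toy_line *l)
{
	assert(l);
	l->valid = false;
	l->dirty = false;
}

/* Writeback + invalidate (clflush-like). */
void toy_writeback_invalidate(struct toy_line *l)
{
	toy_writeback(l);
	toy_invalidate(l);
}
```

In this model, a buffer written by the CPU and then read by a device needs
toy_writeback(), while a buffer written by a device and then read by the
CPU needs toy_invalidate(). A function simply called "flush" does not tell
the caller which of the two it gets.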

> 
> Of course that doesn't solve the problem of who's supposed to call it
> and when in the dma-buf sharing situation.
> -Daniel
> 
> > On Wed, Jan 17, 2018 at 12:31 AM, Daniel Vetter <daniel-/w4YWyX8dFk@public.gmane.org> wrote:
> >>
> >> On Tue, Jan 16, 2018 at 04:35:58PM -0800, Gurchetan Singh wrote:
> >> > The DMA API can be used to flush scatter gather tables and physical
> >> > pages on ARM devices.
> >> >
> >> > Signed-off-by: Gurchetan Singh <gurchetansingh-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> >> > ---
> >> >  drivers/gpu/drm/drm_cache.c                 | 17 +++++++++++++++++
> >> >  drivers/gpu/drm/rockchip/rockchip_drm_gem.c |  7 ++-----
> >> >  drivers/gpu/drm/tegra/gem.c                 |  6 +-----
> >> >  3 files changed, 20 insertions(+), 10 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
> >> > index 3d2bb9d71a60..98d6ebb40e96 100644
> >> > --- a/drivers/gpu/drm/drm_cache.c
> >> > +++ b/drivers/gpu/drm/drm_cache.c
> >> > @@ -105,6 +105,18 @@ drm_flush_pages(struct device *dev, struct page *pages[],
> >> >                                  (unsigned long)page_virtual + PAGE_SIZE);
> >> >               kunmap_atomic(page_virtual);
> >> >       }
> >> > +#elif defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> >> > +     unsigned long i;
> >> > +     dma_addr_t dma_handle;
> >> > +
> >> > +     if (!dev)
> >> > +             return;
> >> > +
> >> > +     for (i = 0; i < num_pages; i++) {
> >> > +             dma_handle = phys_to_dma(drm->dev, page_to_phys(pages[i]));
> >> > +             dma_sync_single_for_device(dev, dma_handle, PAGE_SIZE,
> >> > +                                        DMA_TO_DEVICE);
> >>
> >> Erm no. These functions here are super-low-level functions used by drivers
> >> which know exactly what they're doing. Which is reimplementing half of the
> >> dma api behind the dma api's back because the dma api just isn't quite
> >> sufficient for implementing fast gpu drivers.
> >>
> >> If all you end up doing is calling the dma api again, then pls just call
> >> it directly.
> >>
> >> And just to make it clear: I'd be perfectly fine with adding arm support
> >> here, but then it'd need to directly flush cpu caches and bypass the dma
> >> api. Otherwise this is pointless.
> >> -Daniel
> >>
> >> > +     }
> >> >  #else
> >> >       pr_err("Architecture has no drm_cache.c support\n");
> >> >       WARN_ON_ONCE(1);
> >> > @@ -136,6 +148,11 @@ drm_flush_sg(struct device *dev, struct sg_table *st)
> >> >
> >> >       if (wbinvd_on_all_cpus())
> >> >               pr_err("Timed out waiting for cache flush\n");
> >> > +#elif defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> >> > +     if (!dev)
> >> > +             return;
> >> > +
> >> > +     dma_sync_sg_for_device(dev, st->sgl, st->nents, DMA_TO_DEVICE);
> >> >  #else
> >> >       pr_err("Architecture has no drm_cache.c support\n");
> >> >       WARN_ON_ONCE(1);
> >> > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
> >> > index 8ac7eb25e46d..0157f90b5d10 100644
> >> > --- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
> >> > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
> >> > @@ -14,6 +14,7 @@
> >> >
> >> >  #include <drm/drm.h>
> >> >  #include <drm/drmP.h>
> >> > +#include <drm/drm_cache.h>
> >> >  #include <drm/drm_gem.h>
> >> >  #include <drm/drm_vma_manager.h>
> >> >  #include <linux/iommu.h>
> >> > @@ -99,15 +100,11 @@ static int rockchip_gem_get_pages(struct rockchip_gem_object *rk_obj)
> >> >       /*
> >> >        * Fake up the SG table so that dma_sync_sg_for_device() can be used
> >> >        * to flush the pages associated with it.
> >> > -      *
> >> > -      * TODO: Replace this by drm_flush_sg() once it can be implemented
> >> > -      * without relying on symbols that are not exported.
> >> >        */
> >> >       for_each_sg(rk_obj->sgt->sgl, s, rk_obj->sgt->nents, i)
> >> >               sg_dma_address(s) = sg_phys(s);
> >> >
> >> > -     dma_sync_sg_for_device(drm->dev, rk_obj->sgt->sgl, rk_obj->sgt->nents,
> >> > -                            DMA_TO_DEVICE);
> >> > +     drm_flush_sg(drm->dev, rk_obj->sgt);
> >> >
> >> >       return 0;
> >> >
> >> > diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> >> > index ab1e53d434e8..9945fd2f6bd6 100644
> >> > --- a/drivers/gpu/drm/tegra/gem.c
> >> > +++ b/drivers/gpu/drm/tegra/gem.c
> >> > @@ -230,15 +230,11 @@ static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo)
> >> >       /*
> >> >        * Fake up the SG table so that dma_sync_sg_for_device() can be used
> >> >        * to flush the pages associated with it.
> >> > -      *
> >> > -      * TODO: Replace this by drm_clflash_sg() once it can be implemented
> >> > -      * without relying on symbols that are not exported.
> >> >        */
> >> >       for_each_sg(bo->sgt->sgl, s, bo->sgt->nents, i)
> >> >               sg_dma_address(s) = sg_phys(s);
> >> >
> >> > -     dma_sync_sg_for_device(drm->dev, bo->sgt->sgl, bo->sgt->nents,
> >> > -                            DMA_TO_DEVICE);
> >> > +     drm_flush_sg(drm->dev, bo->sgt);
> >> >
> >> >       return 0;
> >> >
> >> > --
> >> > 2.13.5
> >> >
> >> > _______________________________________________
> >> > dri-devel mailing list
> >> > dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> >> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >>
> >> --
> >> Daniel Vetter
> >> Software Engineer, Intel Corporation
> >> http://blog.ffwll.ch
> >
> >
> 
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ville Syrjälä
Intel OTC
