From: vinod.koul@intel.com (Vinod Koul)
To: linux-arm-kernel@lists.infradead.org
Subject: [linux-sunxi] Re: [PATCH] dma: sun4i: expose block size and wait cycle configuration to DMA users
Date: Tue, 8 Mar 2016 15:35:38 +0530
Message-ID: <20160308100538.GO11154@localhost>
In-Reply-To: <56DE9077.3020905@redhat.com>

On Tue, Mar 08, 2016 at 09:42:31AM +0100, Hans de Goede wrote:
> <wild speculation>
> 
> I see 2 possible reasons why waiting before checking for drq can help:
> 
> 1) A lot of devices have an internal fifo hooked up to a single mmio data
> register which gets read using the general purpose dma-engine; waiting
> allows this fifo to fill, and thus allows burst transfers.
> (We've seen similar issues with the scanout engine for the display, which
>  has its own dma engine, and doing larger transfers helps a lot.)
> 
> 2) Physical memory on the sunxi SoCs is (often) divided into banks
> with a shared data / address bus, so doing bank switches is expensive.
> These wait cycles may introduce latency which allows a user of another
> bank to complete its RAM accesses before the dma engine forces a
> bank switch. This avoids a lot of (interleaved) bank switches
> while both try to access a different bank, and thus waiting makes things
> (much) faster in the end (again a known problem with the display
> scanout engine).
> 
> </wild speculation>
> 
> Note the differences these kinda tweaks make can be quite dramatic,
> when using a 1920x1080p60 hdmi output on the A10 SoC with a 16 bit
> memory bus (real world worst case scenario), the memory bandwidth
> left for userspace processes (measured through memset) almost doubles
> from 48 MB/s to 85 MB/s, source:
> http://ssvb.github.io/2014/11/11/revisiting-fullhd-x11-desktop-performance-of-the-allwinner-a10.html
> 
> TL;DR: Waiting before starting DMA allows for doing larger burst
> transfers which ends up making things more efficient.
> 
> Given this, I really expect there to be other dma-engines which
> have some option to wait a bit before starting/unpausing a transfer
> instead of starting it as soon as (more) data is available, so I think
> this would make a good addition to dma_slave_config.

I tend to agree but before we do that I would like this hypothesis to be
confirmed :)

-- 
~Vinod
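
As a rough way to reason about hypothesis 2 before measuring on real
hardware, the toy model below counts bank switches when two bus masters,
each confined to its own DRAM bank, take turns on a shared bus. It is only
a sketch under stated assumptions: the function name bank_switches(), the
strict round-robin arbitration and the "batch" parameter (how many accesses
a master issues per turn, standing in for the effect of the wait cycles)
are all hypothetical and do not come from the sun4i driver or the dmaengine
API.

```c
#include <stddef.h>

/*
 * Toy model (hypothetical, not driver code): two bus masters each issue
 * `total` accesses, master 0 only to bank 0 and master 1 only to bank 1.
 * They alternate turns, and on each turn a master issues up to `batch`
 * accesses back to back. A bank switch is counted whenever the bank being
 * accessed differs from the previously accessed one.
 */
static unsigned long bank_switches(unsigned long total, unsigned long batch)
{
	unsigned long remaining[2] = { total, total };
	unsigned long switches = 0;
	int open_bank = -1;	/* no bank has been accessed yet */
	int master = 0;

	if (batch == 0)
		batch = 1;	/* a turn always issues at least one access */

	while (remaining[0] || remaining[1]) {
		if (remaining[master]) {
			unsigned long n = remaining[master] < batch ?
					  remaining[master] : batch;

			/* Moving to the other master's bank costs a switch. */
			if (open_bank != -1 && open_bank != master)
				switches++;
			open_bank = master;
			remaining[master] -= n;
		}
		master ^= 1;	/* hand the bus to the other master */
	}

	return switches;
}
```

Under these assumptions, bank_switches(1024, 1) models fully interleaved
accesses and bank_switches(1024, 8) models each master being allowed eight
accesses per turn; the switch count drops roughly in proportion to the
batch size. This says nothing about real DRAM timings, it only illustrates
why letting accesses accumulate could reduce the number of switches.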

WARNING: multiple messages have this Message-ID (diff)
From: Vinod Koul <vinod.koul@intel.com>
To: Hans de Goede <hdegoede@redhat.com>
Cc: maxime.ripard@free-electrons.com,
	"Boris Brezillon" <boris.brezillon@free-electrons.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	dmaengine@vger.kernel.org, "Chen-Yu Tsai" <wens@csie.org>,
	linux-sunxi@googlegroups.com,
	"Emilio López" <emilio@elopez.com.ar>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [linux-sunxi] Re: [PATCH] dma: sun4i: expose block size and wait cycle configuration to DMA users
Date: Tue, 8 Mar 2016 15:35:38 +0530
Message-ID: <20160308100538.GO11154@localhost>
In-Reply-To: <56DE9077.3020905@redhat.com>


