linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
To: Jeffrey Hugo <quic_jhugo@quicinc.com>
Cc: linux-doc@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	ogabbay@kernel.org, dri-devel@lists.freedesktop.org,
	quic_ajitpals@quicinc.com, quic_pkanojiy@quicinc.com,
	quic_carlv@quicinc.com, jacek.lawrynowicz@linux.intel.com
Subject: Re: [PATCH v2 5/8] accel/qaic: Add datapath
Date: Fri, 3 Mar 2023 14:49:32 +0100	[thread overview]
Message-ID: <20230303134932.GF3963532@linux.intel.com> (raw)
In-Reply-To: <5eed22fc-cd22-8186-de08-98827852a518@quicinc.com>

On Wed, Mar 01, 2023 at 11:14:35AM -0700, Jeffrey Hugo wrote:
> On 3/1/2023 10:05 AM, Stanislaw Gruszka wrote:
> > On Wed, Mar 01, 2023 at 09:08:03AM -0700, Jeffrey Hugo wrote:
> > > > This looks a bit suspicious. Are you sure you can modify
> > > > sg->dma_address and still use it as valid value ?
> > > 
> > > A single entry in the sg table is a contiguous mapping of memory.  If it
> > > wasn't contiguous, it would have to be broken up into multiple entries.  In
> > > the simple case, a driver is going to take the dma_address/len pair and hand
> > > that directly to the device.  Then the device is going to access every
> > > address in that range.
> > > 
> > > If the device can access every address from dma_address to dma_address +
> > > len, why can't it access a subset of that?
> > 
> > Required address alignment can be broken. Not sure if only that.
> 
> AIC100 doesn't have required alignment.  AIC100 can access any 64-bit
> address, at a byte level granularity.  The only restriction AIC100 has is
> that the size of a transfer is restricted to a 32-bit value, so max
> individual transfer size of 4GB.  Transferring more than 4GB requires
> multiple transactions.
> 
> > > > > Are you suggesting renaming
> > > > > this function?  I guess I'm not quite understanding your comment here. Can
> > > > > you elaborate?
> > > > 
> > > > Renaming would be nice. I was thinking by simplifying it, not sure
> > > > now if that's easy achievable, though.
> > > 
> > > Ok.  I'll think on this.
> > 
> > Maybe this function could be removed ? And create sg lists
> > that hardware can handle without any modification.
> > Just idea to consider, not any requirement.
> 
> Ok, so this is part of our "slicing" operation, and thus required.
> 
> Maybe how slicing works is not clear.
> 
> Lets say that we have a workload on AIC100 that can identify a license plate
> in a picture (aka lprnet).  Lets assume this workload only needs the RGB
> values of a RGBA file (a "jpeg" we are processing).
> 
> Userspace allocates a BO to hold the entire file.  A quarter of the file is
> R values, a quarter is G values, etc.  For simplicity, lets assume the R
> values are all sequentially listed, then the G values, then the B values,
> finally the A values.  When we allocate the BO, we map it once.  If we have
> an IOMMU, this optimizes the IOMMU mappings.  BOs can be quite large.  We
> have some test workloads based on real world workloads where each BO is
> 16-32M in size, and there are multiple BOs.  I don't want to map a 32M BO N
> duplicate times in the IOMMU.
> 
> So, now userspace slices the BO.  It tells us we need to transfer the RGB
> values (the first 75% of the BO), but not the A values.  So, we create a
> copy of the mapped SG and edit it to represent this transfer, which is a
> subset of the entire BO.  Using the slice information and the mapping
> information, we construct the DMA engine commands that can be used to
> transfer the relevant portions of the BO to the device.
> 
> It sounds like you are suggesting, lets flip this around.  Don't map the
> entire BO once.  Instead, wait for the slice info from userspace, construct
> a sg list based on the parts of the BO for the slice, and map that.  Then
> the driver gets a mapped SG it can just use.  The issue I see with that is
> slices can overlap.  You can transfer the same part of a BO multiple times.
> Maybe lprnet has multiple threads on AIC100 where thread A consumes R data,
> thread B consumes R and G data, and thread C consumes B data.  We need to
> transfer the R data twice to different device locations so that threads A
> and B can consume the R data independently.
> 
> If we map per slice, we are going to map the R part of the BO twice in the
> IOMMU.  Is that valid?  It feels possible that there exists some IOMMU
> implementation that won't allow multiple IOVAs to map to the same DDR PA
> because that is weird and the implementer thinks its a software bug.  I
> don't want to run into that.  Assuming it is valid, that is multiple
> mappings in the IOMMU TLB which could have been a single mapping.  We are
> wasting IOMMU resources.
> 
> There are some ARM systems we support with limited IOVA space in the IOMMU,
> and we've had some issues with exhausting that space.  The current
> implementation is influenced by those experiences.

Ok, then the current implementation seems reasonable.
Thanks for explanation!

Regards
Stanislaw

  reply	other threads:[~2023-03-03 13:49 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-06 15:41 [PATCH v2 0/8] QAIC accel driver Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 1/8] accel/qaic: Add documentation for AIC100 accelerator driver Jeffrey Hugo
2023-02-14 11:08   ` Jacek Lawrynowicz
2023-02-15 15:41     ` Jeffrey Hugo
2023-02-16 14:18       ` Jacek Lawrynowicz
2023-02-16 15:20         ` Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 2/8] accel/qaic: Add uapi and core driver file Jeffrey Hugo
2023-02-16 14:13   ` Jacek Lawrynowicz
2023-02-17 18:15     ` Jeffrey Hugo
2023-02-22 15:52       ` Dafna Hirschfeld
2023-02-24 21:21         ` Jeffrey Hugo
2023-03-01 19:46       ` Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 3/8] accel/qaic: Add MHI controller Jeffrey Hugo
2023-02-28 11:52   ` Stanislaw Gruszka
2023-03-01 16:09     ` Jeffrey Hugo
2023-03-03 13:57       ` Stanislaw Gruszka
2023-02-06 15:41 ` [PATCH v2 4/8] accel/qaic: Add control path Jeffrey Hugo
2023-03-01 13:18   ` Stanislaw Gruszka
2023-03-01 17:02     ` Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 5/8] accel/qaic: Add datapath Jeffrey Hugo
2023-02-24 15:25   ` Stanislaw Gruszka
2023-02-24 19:36     ` Jeffrey Hugo
2023-02-27 17:14       ` Stanislaw Gruszka
2023-03-01 16:08         ` Jeffrey Hugo
2023-03-01 17:05           ` Stanislaw Gruszka
2023-03-01 18:14             ` Jeffrey Hugo
2023-03-03 13:49               ` Stanislaw Gruszka [this message]
2023-02-06 15:41 ` [PATCH v2 6/8] accel/qaic: Add mhi_qaic_cntl Jeffrey Hugo
2023-03-01 13:19   ` Stanislaw Gruszka
2023-02-06 15:41 ` [PATCH v2 7/8] accel/qaic: Add qaic driver to the build system Jeffrey Hugo
2023-03-01 13:02   ` Stanislaw Gruszka
2023-03-01 16:10     ` Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 8/8] MAINTAINERS: Add entry for QAIC driver Jeffrey Hugo
2023-03-01 13:03   ` Stanislaw Gruszka
2023-03-01 16:15     ` Jeffrey Hugo
2023-02-08 22:01 ` [PATCH v2 0/8] QAIC accel driver Jeffrey Hugo
2023-02-17 16:14   ` Jeffrey Hugo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230303134932.GF3963532@linux.intel.com \
    --to=stanislaw.gruszka@linux.intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jacek.lawrynowicz@linux.intel.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=ogabbay@kernel.org \
    --cc=quic_ajitpals@quicinc.com \
    --cc=quic_carlv@quicinc.com \
    --cc=quic_jhugo@quicinc.com \
    --cc=quic_pkanojiy@quicinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).