From: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
To: Jeffrey Hugo <quic_jhugo@quicinc.com>
Cc: linux-doc@vger.kernel.org, linux-arm-msm@vger.kernel.org,
ogabbay@kernel.org, dri-devel@lists.freedesktop.org,
quic_ajitpals@quicinc.com, quic_pkanojiy@quicinc.com,
quic_carlv@quicinc.com, jacek.lawrynowicz@linux.intel.com
Subject: Re: [PATCH v2 5/8] accel/qaic: Add datapath
Date: Fri, 3 Mar 2023 14:49:32 +0100 [thread overview]
Message-ID: <20230303134932.GF3963532@linux.intel.com> (raw)
In-Reply-To: <5eed22fc-cd22-8186-de08-98827852a518@quicinc.com>
On Wed, Mar 01, 2023 at 11:14:35AM -0700, Jeffrey Hugo wrote:
> On 3/1/2023 10:05 AM, Stanislaw Gruszka wrote:
> > On Wed, Mar 01, 2023 at 09:08:03AM -0700, Jeffrey Hugo wrote:
> > > > This looks a bit suspicious. Are you sure you can modify
> > > > sg->dma_address and still use it as valid value ?
> > >
> > > A single entry in the sg table is a contiguous mapping of memory. If it
> > > wasn't contiguous, it would have to be broken up into multiple entries. In
> > > the simple case, a driver is going to take the dma_address/len pair and hand
> > > that directly to the device. Then the device is going to access every
> > > address in that range.
> > >
> > > If the device can access every address from dma_address to dma_address +
> > > len, why can't it access a subset of that?
> >
> > Required address alignment can be broken. Not sure if only that.
>
> AIC100 doesn't have required alignment. AIC100 can access any 64-bit
> address, at a byte level granularity. The only restriction AIC100 has is
> that the size of a transfer is restricted to a 32-bit value, so max
> individual transfer size of 4GB. Transferring more than 4GB requires
> multiple transactions.
>
> > > > > Are you suggesting renaming
> > > > > this function? I guess I'm not quite understanding your comment here. Can
> > > > > you elaborate?
> > > >
> > > > Renaming would be nice. I was thinking by simplifying it, not sure
> > > > now if that's easy achievable, though.
> > >
> > > Ok. I'll think on this.
> >
> > Maybe this function could be removed ? And create sg lists
> > that hardware can handle without any modification.
> > Just idea to consider, not any requirement.
>
> Ok, so this is part of our "slicing" operation, and thus required.
>
> Maybe how slicing works is not clear.
>
> Lets say that we have a workload on AIC100 that can identify a license plate
> in a picture (aka lprnet). Lets assume this workload only needs the RGB
> values of a RGBA file (a "jpeg" we are processing).
>
> Userspace allocates a BO to hold the entire file. A quarter of the file is
> R values, a quarter is G values, etc. For simplicity, lets assume the R
> values are all sequentially listed, then the G values, then the B values,
> finally the A values. When we allocate the BO, we map it once. If we have
> an IOMMU, this optimizes the IOMMU mappings. BOs can be quite large. We
> have some test workloads based on real world workloads where each BO is
> 16-32M in size, and there are multiple BOs. I don't want to map a 32M BO N
> duplicate times in the IOMMU.
>
> So, now userspace slices the BO. It tells us we need to transfer the RGB
> values (the first 75% of the BO), but not the A values. So, we create a
> copy of the mapped SG and edit it to represent this transfer, which is a
> subset of the entire BO. Using the slice information and the mapping
> information, we construct the DMA engine commands that can be used to
> transfer the relevant portions of the BO to the device.
>
> It sounds like you are suggesting, lets flip this around. Don't map the
> entire BO once. Instead, wait for the slice info from userspace, construct
> a sg list based on the parts of the BO for the slice, and map that. Then
> the driver gets a mapped SG it can just use. The issue I see with that is
> slices can overlap. You can transfer the same part of a BO multiple times.
> Maybe lprnet has multiple threads on AIC100 where thread A consumes R data,
> thread B consumes R and G data, and thread C consumes B data. We need to
> transfer the R data twice to different device locations so that threads A
> and B can consume the R data independently.
>
> If we map per slice, we are going to map the R part of the BO twice in the
> IOMMU. Is that valid? It feels possible that there exists some IOMMU
> implementation that won't allow multiple IOVAs to map to the same DDR PA
> because that is weird and the implementer thinks its a software bug. I
> don't want to run into that. Assuming it is valid, that is multiple
> mappings in the IOMMU TLB which could have been a single mapping. We are
> wasting IOMMU resources.
>
> There are some ARM systems we support with limited IOVA space in the IOMMU,
> and we've had some issues with exhausting that space. The current
> implementation is influenced by those experiences.
Ok, then the current implementation seems reasonable.
Thanks for explanation!
Regards
Stanislaw
next prev parent reply other threads:[~2023-03-03 13:49 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-06 15:41 [PATCH v2 0/8] QAIC accel driver Jeffrey Hugo
2023-02-06 15:41 ` Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 1/8] accel/qaic: Add documentation for AIC100 accelerator driver Jeffrey Hugo
2023-02-06 15:41 ` Jeffrey Hugo
2023-02-14 11:08 ` Jacek Lawrynowicz
2023-02-15 15:41 ` Jeffrey Hugo
2023-02-15 15:41 ` Jeffrey Hugo
2023-02-16 14:18 ` Jacek Lawrynowicz
2023-02-16 15:20 ` Jeffrey Hugo
2023-02-16 15:20 ` Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 2/8] accel/qaic: Add uapi and core driver file Jeffrey Hugo
2023-02-06 15:41 ` Jeffrey Hugo
2023-02-16 14:13 ` Jacek Lawrynowicz
2023-02-17 18:15 ` Jeffrey Hugo
2023-02-17 18:15 ` Jeffrey Hugo
2023-02-22 15:52 ` Dafna Hirschfeld
2023-02-22 15:52 ` Dafna Hirschfeld
2023-02-24 21:21 ` Jeffrey Hugo
2023-02-24 21:21 ` Jeffrey Hugo
2023-03-01 19:46 ` Jeffrey Hugo
2023-03-01 19:46 ` Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 3/8] accel/qaic: Add MHI controller Jeffrey Hugo
2023-02-06 15:41 ` Jeffrey Hugo
2023-02-28 11:52 ` Stanislaw Gruszka
2023-02-28 11:52 ` Stanislaw Gruszka
2023-03-01 16:09 ` Jeffrey Hugo
2023-03-01 16:09 ` Jeffrey Hugo
2023-03-03 13:57 ` Stanislaw Gruszka
2023-02-06 15:41 ` [PATCH v2 4/8] accel/qaic: Add control path Jeffrey Hugo
2023-02-06 15:41 ` Jeffrey Hugo
2023-03-01 13:18 ` Stanislaw Gruszka
2023-03-01 13:18 ` Stanislaw Gruszka
2023-03-01 17:02 ` Jeffrey Hugo
2023-03-01 17:02 ` Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 5/8] accel/qaic: Add datapath Jeffrey Hugo
2023-02-06 15:41 ` Jeffrey Hugo
2023-02-24 15:25 ` Stanislaw Gruszka
2023-02-24 15:25 ` Stanislaw Gruszka
2023-02-24 19:36 ` Jeffrey Hugo
2023-02-24 19:36 ` Jeffrey Hugo
2023-02-27 17:14 ` Stanislaw Gruszka
2023-02-27 17:14 ` Stanislaw Gruszka
2023-03-01 16:08 ` Jeffrey Hugo
2023-03-01 16:08 ` Jeffrey Hugo
2023-03-01 17:05 ` Stanislaw Gruszka
2023-03-01 18:14 ` Jeffrey Hugo
2023-03-01 18:14 ` Jeffrey Hugo
2023-03-03 13:49 ` Stanislaw Gruszka [this message]
2023-02-06 15:41 ` [PATCH v2 6/8] accel/qaic: Add mhi_qaic_cntl Jeffrey Hugo
2023-02-06 15:41 ` Jeffrey Hugo
2023-03-01 13:19 ` Stanislaw Gruszka
2023-03-01 13:19 ` Stanislaw Gruszka
2023-02-06 15:41 ` [PATCH v2 7/8] accel/qaic: Add qaic driver to the build system Jeffrey Hugo
2023-02-06 15:41 ` Jeffrey Hugo
2023-03-01 13:02 ` Stanislaw Gruszka
2023-03-01 13:02 ` Stanislaw Gruszka
2023-03-01 16:10 ` Jeffrey Hugo
2023-03-01 16:10 ` Jeffrey Hugo
2023-02-06 15:41 ` [PATCH v2 8/8] MAINTAINERS: Add entry for QAIC driver Jeffrey Hugo
2023-02-06 15:41 ` Jeffrey Hugo
2023-03-01 13:03 ` Stanislaw Gruszka
2023-03-01 13:03 ` Stanislaw Gruszka
2023-03-01 16:15 ` Jeffrey Hugo
2023-03-01 16:15 ` Jeffrey Hugo
2023-02-08 22:01 ` [PATCH v2 0/8] QAIC accel driver Jeffrey Hugo
2023-02-08 22:01 ` Jeffrey Hugo
2023-02-17 16:14 ` Jeffrey Hugo
2023-02-17 16:14 ` Jeffrey Hugo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230303134932.GF3963532@linux.intel.com \
--to=stanislaw.gruszka@linux.intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=jacek.lawrynowicz@linux.intel.com \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=ogabbay@kernel.org \
--cc=quic_ajitpals@quicinc.com \
--cc=quic_carlv@quicinc.com \
--cc=quic_jhugo@quicinc.com \
--cc=quic_pkanojiy@quicinc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.