public inbox for linux-tegra@vger.kernel.org
 help / color / mirror / Atom feed
From: Thierry Reding <thierry.reding@gmail.com>
To: Thierry Reding <thierry.reding@gmail.com>
Cc: linux-tegra@vger.kernel.org, Dmitry Osipenko <digetx@gmail.com>,
	dri-devel@lists.freedesktop.org,
	Mikko Perttunen <mperttunen@nvidia.com>
Subject: [PATCH v3 03/16] gpu: host1x: Introduce support for wide opcodes
Date: Fri,  1 Feb 2019 14:28:24 +0100	[thread overview]
Message-ID: <20190201132837.12327-4-thierry.reding@gmail.com> (raw)
In-Reply-To: <20190201132837.12327-1-thierry.reding@gmail.com>

From: Thierry Reding <treding@nvidia.com>

The CDMA push buffer can currently only handle opcodes that take a
single word parameter. However, the host1x implementation on Tegra186
and later supports opcodes that require multiple words as parameters.

Unfortunately the way the push buffer is structured, these wide opcodes
cannot simply be composed of two regular opcodes because that could
result in the wide opcode being split across the end of the push buffer
and the final RESTART opcode required to wrap the push buffer around
would break the wide opcode.

One way to fix this would be to remove the concept of slots to simplify
push buffer operations. However, that's not entirely trivial and should
be done in a separate patch. For now, simply use a different function
to push four-word opcodes into the push buffer. Technically only three
words are pushed, with the fourth word used as padding to preserve the
2-word alignment required by the slots abstraction. The fourth word is
always a NOP opcode.

Additional care must be taken when the end of the push buffer is
reached. If a four-word opcode doesn't fit into the push buffer without
being split by the boundary, NOP opcodes will be introduced and the new
wide opcode placed at the beginning of the push buffer.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/host1x/cdma.c     | 92 +++++++++++++++++++++++++++++++++++
 drivers/gpu/host1x/cdma.h     |  2 +
 include/trace/events/host1x.h | 26 ++++++++++
 3 files changed, 120 insertions(+)

diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
index 91df51e631b2..e2d106fa3c0b 100644
--- a/drivers/gpu/host1x/cdma.c
+++ b/drivers/gpu/host1x/cdma.c
@@ -217,6 +217,45 @@ unsigned int host1x_cdma_wait_locked(struct host1x_cdma *cdma,
 	return 0;
 }
 
+/*
+ * Sleep (if necessary) until the push buffer has enough free space.
+ *
+ * Must be called with the cdma lock held.
+ */
+int host1x_cdma_wait_pushbuffer_space(struct host1x *host1x,
+				      struct host1x_cdma *cdma,
+				      unsigned int needed)
+{
+	while (true) {
+		struct push_buffer *pb = &cdma->push_buffer;
+		unsigned int space;
+
+		space = host1x_pushbuffer_space(pb);
+		if (space >= needed)
+			break;
+
+		trace_host1x_wait_cdma(dev_name(cdma_to_channel(cdma)->dev),
+				       CDMA_EVENT_PUSH_BUFFER_SPACE);
+
+		host1x_hw_cdma_flush(host1x, cdma);
+
+		/* If somebody has managed to already start waiting, yield */
+		if (cdma->event != CDMA_EVENT_NONE) {
+			mutex_unlock(&cdma->lock);
+			schedule();
+			mutex_lock(&cdma->lock);
+			continue;
+		}
+
+		cdma->event = CDMA_EVENT_PUSH_BUFFER_SPACE;
+
+		mutex_unlock(&cdma->lock);
+		down(&cdma->sem);
+		mutex_lock(&cdma->lock);
+	}
+
+	return 0;
+}
 /*
  * Start timer that tracks the time spent by the job.
  * Must be called with the cdma lock held.
@@ -509,6 +548,59 @@ void host1x_cdma_push(struct host1x_cdma *cdma, u32 op1, u32 op2)
 	host1x_pushbuffer_push(pb, op1, op2);
 }
 
+/*
+ * Push four words into two consecutive push buffer slots. Note that extra
+ * care needs to be taken not to split the two slots across the end of the
+ * push buffer. Otherwise the RESTART opcode at the end of the push buffer
+ * that ensures processing will restart at the beginning will break up the
+ * four words.
+ *
+ * Blocks as necessary if the push buffer is full.
+ */
+void host1x_cdma_push_wide(struct host1x_cdma *cdma, u32 op1, u32 op2,
+			   u32 op3, u32 op4)
+{
+	struct host1x_channel *channel = cdma_to_channel(cdma);
+	struct host1x *host1x = cdma_to_host1x(cdma);
+	struct push_buffer *pb = &cdma->push_buffer;
+	unsigned int needed = 2, extra = 0, i;
+	unsigned int space = cdma->slots_free;
+
+	if (host1x_debug_trace_cmdbuf)
+		trace_host1x_cdma_push_wide(dev_name(channel->dev), op1, op2,
+					    op3, op4);
+
+	/* compute number of extra slots needed for padding */
+	if (pb->pos + 16 > pb->size) {
+		extra = (pb->size - pb->pos) / 8;
+		needed += extra;
+	}
+
+	host1x_cdma_wait_pushbuffer_space(host1x, cdma, needed);
+	space = host1x_pushbuffer_space(pb);
+
+	cdma->slots_free = space - needed;
+	cdma->slots_used += needed;
+
+	/*
+	 * Note that we rely on the fact that this is only used to submit wide
+	 * gather opcodes, which consist of 3 words, and they are padded with
+	 * a NOP to avoid having to deal with fractional slots (a slot always
+	 * represents 2 words). The fourth opcode passed to this function will
+	 * therefore always be a NOP.
+	 *
+	 * This works around a slight ambiguity when it comes to opcodes. For
+	 * all current host1x incarnations the NOP opcode uses the exact same
+	 * encoding (0x20000000), so we could hard-code the value here, but a
+	 * new incarnation may change it and break that assumption.
+	 */
+	for (i = 0; i < extra; i++)
+		host1x_pushbuffer_push(pb, op4, op4);
+
+	host1x_pushbuffer_push(pb, op1, op2);
+	host1x_pushbuffer_push(pb, op3, op4);
+}
+
 /*
  * End a cdma submit
  * Kick off DMA, add job to the sync queue, and a number of slots to be freed
diff --git a/drivers/gpu/host1x/cdma.h b/drivers/gpu/host1x/cdma.h
index e97e17b82370..ebf6c6a06541 100644
--- a/drivers/gpu/host1x/cdma.h
+++ b/drivers/gpu/host1x/cdma.h
@@ -90,6 +90,8 @@ int host1x_cdma_init(struct host1x_cdma *cdma);
 int host1x_cdma_deinit(struct host1x_cdma *cdma);
 int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job);
 void host1x_cdma_push(struct host1x_cdma *cdma, u32 op1, u32 op2);
+void host1x_cdma_push_wide(struct host1x_cdma *cdma, u32 op1, u32 op2,
+			   u32 op3, u32 op4);
 void host1x_cdma_end(struct host1x_cdma *cdma, struct host1x_job *job);
 void host1x_cdma_update(struct host1x_cdma *cdma);
 void host1x_cdma_peek(struct host1x_cdma *cdma, u32 dmaget, int slot,
diff --git a/include/trace/events/host1x.h b/include/trace/events/host1x.h
index a37ef73092e5..3d340b6f1ea3 100644
--- a/include/trace/events/host1x.h
+++ b/include/trace/events/host1x.h
@@ -80,6 +80,32 @@ TRACE_EVENT(host1x_cdma_push,
 		__entry->name, __entry->op1, __entry->op2)
 );
 
+TRACE_EVENT(host1x_cdma_push_wide,
+	TP_PROTO(const char *name, u32 op1, u32 op2, u32 op3, u32 op4),
+
+	TP_ARGS(name, op1, op2, op3, op4),
+
+	TP_STRUCT__entry(
+		__field(const char *, name)
+		__field(u32, op1)
+		__field(u32, op2)
+		__field(u32, op3)
+		__field(u32, op4)
+	),
+
+	TP_fast_assign(
+		__entry->name = name;
+		__entry->op1 = op1;
+		__entry->op2 = op2;
+		__entry->op3 = op3;
+		__entry->op4 = op4;
+	),
+
+	TP_printk("name=%s, op1=%08x, op2=%08x, op3=%08x op4=%08x",
+		__entry->name, __entry->op1, __entry->op2, __entry->op3,
+		__entry->op4)
+);
+
 TRACE_EVENT(host1x_cdma_push_gather,
 	TP_PROTO(const char *name, struct host1x_bo *bo,
 			u32 words, u32 offset, void *cmdbuf),
-- 
2.19.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  parent reply	other threads:[~2019-02-01 13:28 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-01 13:28 [PATCH v3 00/16] drm/tegra: Fix IOVA space on Tegra186 and later Thierry Reding
2019-02-01 13:28 ` [PATCH v3 01/16] gpu: host1x: Set up stream ID table Thierry Reding
2019-02-01 13:28 ` [PATCH v3 02/16] gpu: host1x: Program the channel stream ID Thierry Reding
2019-02-01 13:28 ` Thierry Reding [this message]
2019-02-01 13:28 ` [PATCH v3 04/16] gpu: host1x: Support 40-bit addressing Thierry Reding
2019-02-01 13:28 ` [PATCH v3 05/16] gpu: host1x: Use direct DMA with IOMMU API usage Thierry Reding
2019-02-01 14:46   ` Dmitry Osipenko
2019-02-01 13:28 ` [PATCH v3 06/16] gpu: host1x: Restrict IOVA space to DMA mask Thierry Reding
2019-02-01 14:47   ` Dmitry Osipenko
2019-02-01 13:28 ` [PATCH v3 07/16] gpu: host1x: Support 40-bit addressing on Tegra186 Thierry Reding
2019-02-01 13:28 ` [PATCH v3 08/16] gpu: host1x: Use correct semantics for HOST1X_CHANNEL_DMAEND Thierry Reding
2019-02-01 13:37   ` Dmitry Osipenko
2019-02-01 13:41     ` Thierry Reding
2019-02-01 13:48       ` Dmitry Osipenko
2019-02-01 14:10         ` Thierry Reding
2019-02-01 14:40           ` Dmitry Osipenko
2019-02-01 14:47   ` Dmitry Osipenko
2019-02-01 13:28 ` [PATCH v3 09/16] gpu: host1x: Optimize CDMA push buffer memory usage Thierry Reding
2019-02-01 14:48   ` Dmitry Osipenko
2019-02-01 13:28 ` [PATCH v3 10/16] drm/tegra: Store parent pointer in Tegra DRM clients Thierry Reding
2019-02-01 14:48   ` Dmitry Osipenko
2019-02-01 13:28 ` [PATCH v3 11/16] drm/tegra: vic: Load firmware on demand Thierry Reding
2019-02-01 13:28 ` [PATCH v3 12/16] drm/tegra: Setup shared IOMMU domain after initialization Thierry Reding
2019-02-01 14:48   ` Dmitry Osipenko
2019-02-01 13:28 ` [PATCH v3 13/16] drm/tegra: Restrict IOVA space to DMA mask Thierry Reding
2019-02-01 14:49   ` Dmitry Osipenko
2019-02-01 13:28 ` [PATCH v3 14/16] drm/tegra: vic: Do not clear driver data Thierry Reding
2019-02-01 13:28 ` [PATCH v3 15/16] drm/tegra: vic: Support stream ID register programming Thierry Reding
2019-02-01 13:28 ` [PATCH v3 16/16] arm64: tegra: Enable SMMU for VIC on Tegra186 Thierry Reding
2019-02-01 14:46 ` [PATCH v3 00/16] drm/tegra: Fix IOVA space on Tegra186 and later Dmitry Osipenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190201132837.12327-4-thierry.reding@gmail.com \
    --to=thierry.reding@gmail.com \
    --cc=digetx@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=mperttunen@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox