linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 0/2] mailbox: add async request mechanism w/ a user
@ 2024-10-17 16:36 Tudor Ambarus
  2024-10-17 16:36 ` [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues Tudor Ambarus
  2024-10-17 16:36 ` [PATCH v2 2/2] firmware: add exynos acpm driver Tudor Ambarus
  0 siblings, 2 replies; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-17 16:36 UTC (permalink / raw)
  To: jassisinghbrar, krzk
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano, Tudor Ambarus

Hi,

This adds a simple mailbox async mechanism, similar to the one found in
the crypto subsystem. It allows mailbox controllers that have hardware
queues to actually make use of them.

It then adds a user for this mechanism, the ACPM driver.
ACPM (Alive Clock and Power Manager) is firmware that runs on the APM
(Active Power Management) module and handles the overall power management
activities. ACPM and the masters communicate with each other using mailbox
messages and shared memory (mmio-sram). The shared memory contains
channel configuration data: at a specific offset it exposes the channel
ID, the message and queue lengths, pointers to the TX and RX queues
(which are also part of the shared memory), and whether the RX queues
work by polling or by interrupts. In that respect the driver resembles
arm-scmi, which also uses mailbox messages and shared memory to
communicate with its firmware.
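
For reference, the per-channel configuration descriptor that the driver
reads from SRAM looks like this (struct exynos_acpm_shmem_chan in
patch 2/2; the rx_*/tx_* members hold offsets into the SRAM):

	struct exynos_acpm_shmem_chan {
		u32 id;
		u32 reserved[3];
		u32 rx_rear;
		u32 rx_front;
		u32 rx_base;
		u32 reserved1[3];
		u32 tx_rear;
		u32 tx_front;
		u32 tx_base;
		u32 qlen;
		u32 mlen;
		u32 reserved2[2];
		u32 polling;
	};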

The set is marked as v2 because the mailbox core patch was already sent
for review a few weeks ago:
Link: https://lore.kernel.org/linux-arm-kernel/20241004165301.1979527-1-tudor.ambarus@linaro.org/

If everyone is happy with the current form of the set, we'll probably
need an immutable tag/branch to be shared between the mailbox and firmware
trees.

Thanks,
ta

Changes in v2:
- add the exynos acpm driver - new patch.
- extend the mailbox request with rx and tx len - let the client decide
  how much to write and get back from the controller. The controller can
  verify the lengths by comparing them with its channel length.
- extend the mailbox request with flags, in particular with
  MBOX_REQ_MAY_SLEEP. All requests that don't set this flag are
  considered in atomic context.
- remove a dereference that was done before checking for null.
- update the commit message, rebased on top of v6.12-rc3.


Tudor Ambarus (2):
  mailbox: add async request mechanism to empower controllers w/ hw
    queues
  firmware: add exynos acpm driver

 drivers/firmware/Kconfig                    |   1 +
 drivers/firmware/Makefile                   |   1 +
 drivers/firmware/samsung/Kconfig            |  11 +
 drivers/firmware/samsung/Makefile           |   3 +
 drivers/firmware/samsung/exynos-acpm.c      | 703 ++++++++++++++++++++
 drivers/mailbox/mailbox.c                   | 127 +++-
 include/linux/mailbox/exynos-acpm-message.h |  21 +
 include/linux/mailbox_client.h              |   4 +
 include/linux/mailbox_controller.h          |   7 +
 include/linux/mailbox_request.h             |  33 +
 10 files changed, 888 insertions(+), 23 deletions(-)
 create mode 100644 drivers/firmware/samsung/Kconfig
 create mode 100644 drivers/firmware/samsung/Makefile
 create mode 100644 drivers/firmware/samsung/exynos-acpm.c
 create mode 100644 include/linux/mailbox/exynos-acpm-message.h
 create mode 100644 include/linux/mailbox_request.h

-- 
2.47.0.rc1.288.g06298d1525-goog




* [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-17 16:36 [PATCH v2 0/2] mailbox: add async request mechanism w/ a user Tudor Ambarus
@ 2024-10-17 16:36 ` Tudor Ambarus
  2024-10-18  4:17   ` Jassi Brar
  2024-10-17 16:36 ` [PATCH v2 2/2] firmware: add exynos acpm driver Tudor Ambarus
  1 sibling, 1 reply; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-17 16:36 UTC (permalink / raw)
  To: jassisinghbrar, krzk
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano, Tudor Ambarus

The current form of the mailbox framework doesn't allow controllers to
benefit from their hardware queue capabilities, as the framework handles
a single active request at a time.

The active request is considered completed when TX completes. But it seems
that TX is not in direct relation with RX, so a client can't know which
TX data the RX data corresponds to. Consider a client that sends TX1:
mbox->ops->send_data() is called, the timer is started immediately,
last_tx_done() returns true and the client is notified that TX1 completed.
The client then sends TX2 and gets notified that TX2 completed. When RX
arrives, mbox_chan_received_data(chan, mssg) is called after both TX1 and
TX2 completed, so the client can't know whether the received mssg belongs
to TX1 or to TX2.

To address these shortcomings, add a simple async mechanism based on
requests. A request contains pointers to the tx and (optional) rx data,
along with a pointer to a completion struct. It is the responsibility of
the client to allocate and fill the request:

static int client_send_request(struct demo_client *dc, void *data,
			       size_t data_len)
{
	DECLARE_MBOX_WAIT(wait);
	struct mbox_request req;
	int ret;

	req.wait = &wait;
	req.tx = data;
	req.txlen = data_len;
	/*
	 * Set req.rx = NULL if no response is expected. Here we
	 * use the same memory to get the response.
	 */
	req.rx = data;
	req.rxlen = data_len;

	ret = mbox_send_request(dc->mbox_chan, &req);
	ret = mbox_wait_request(ret, &wait);
	if (ret)
		dev_err(dc->dev, "%s failed %d\n", __func__, ret);
	return ret;
}

mbox_send_request() sends a message on the bus; the call may be made from
atomic context. It returns -EINPROGRESS if the data was fed into hardware,
-ENOSPC when the hardware queue is full, or zero when the request
completed immediately. The message (queue) handling is thus deferred to
the controller.
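
On the controller side, a driver that has a hardware queue would implement
the new send_request() op roughly as below. This is a minimal sketch: the
demo_* names are made up for illustration and are not part of this patch.

static int demo_send_request(struct mbox_chan *chan, struct mbox_request *req)
{
	/* con_priv holds the (hypothetical) controller driver data. */
	struct demo_ctlr *ctlr = chan->con_priv;

	if (demo_hw_queue_full(ctlr))	/* hypothetical helper */
		return -ENOSPC;

	/* Feed the message into the hardware queue (hypothetical helper). */
	demo_hw_enqueue(ctlr, req->tx, req->txlen);

	/* Completion is reported later via mbox_request_complete(req, err). */
	return -EINPROGRESS;
}

static const struct mbox_chan_ops demo_ops = {
	.send_request	= demo_send_request,
};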

A similar mechanism is already used in the crypto subsystem.
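
For comparison, the analogous crypto API pattern looks roughly like this
(using the existing DECLARE_CRYPTO_WAIT()/crypto_wait_req() helpers):

	DECLARE_CRYPTO_WAIT(wait);

	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				      crypto_req_done, &wait);
	ret = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);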

The async request mechanism is mutually exclusive with the current message
software queue handling. In the future the software queue handling could
be offered as an opt-in backlog choice for users that need it, but that
would require converting from ``void *message`` to
``struct mbox_request *req`` throughout the API.

Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
 drivers/mailbox/mailbox.c          | 127 +++++++++++++++++++++++------
 include/linux/mailbox_client.h     |   4 +
 include/linux/mailbox_controller.h |   7 ++
 include/linux/mailbox_request.h    |  33 ++++++++
 4 files changed, 148 insertions(+), 23 deletions(-)
 create mode 100644 include/linux/mailbox_request.h

diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index d3d26a2c9895..4fe905710458 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -158,6 +158,11 @@ static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
  */
 void mbox_chan_received_data(struct mbox_chan *chan, void *mssg)
 {
+	if (chan->mbox->ops->send_request) {
+		dev_dbg(chan->mbox->dev, "Operation not supported for mailbox requests\n");
+		return;
+	}
+
 	/* No buffering the received data */
 	if (chan->cl->rx_callback)
 		chan->cl->rx_callback(chan->cl, mssg);
@@ -176,6 +181,11 @@ EXPORT_SYMBOL_GPL(mbox_chan_received_data);
  */
 void mbox_chan_txdone(struct mbox_chan *chan, int r)
 {
+	if (chan->mbox->ops->send_request) {
+		dev_dbg(chan->mbox->dev, "Operation not supported for mailbox requests\n");
+		return;
+	}
+
 	if (unlikely(!(chan->txdone_method & TXDONE_BY_IRQ))) {
 		dev_err(chan->mbox->dev,
 		       "Controller can't run the TX ticker\n");
@@ -197,6 +207,11 @@ EXPORT_SYMBOL_GPL(mbox_chan_txdone);
  */
 void mbox_client_txdone(struct mbox_chan *chan, int r)
 {
+	if (chan->mbox->ops->send_request) {
+		dev_dbg(chan->mbox->dev, "Operation not supported for mailbox requests\n");
+		return;
+	}
+
 	if (unlikely(!(chan->txdone_method & TXDONE_BY_ACK))) {
 		dev_err(chan->mbox->dev, "Client can't run the TX ticker\n");
 		return;
@@ -261,6 +276,11 @@ int mbox_send_message(struct mbox_chan *chan, void *mssg)
 	if (!chan || !chan->cl)
 		return -EINVAL;
 
+	if (chan->mbox->ops->send_request) {
+		dev_dbg(chan->mbox->dev, "Operation not supported for mailbox requests\n");
+		return -EOPNOTSUPP;
+	}
+
 	t = add_to_rbuf(chan, mssg);
 	if (t < 0) {
 		dev_err(chan->mbox->dev, "Try increasing MBOX_TX_QUEUE_LEN\n");
@@ -289,6 +309,39 @@ int mbox_send_message(struct mbox_chan *chan, void *mssg)
 }
 EXPORT_SYMBOL_GPL(mbox_send_message);
 
+int mbox_send_request(struct mbox_chan *chan, struct mbox_request *req)
+{
+	return chan->mbox->ops->send_request(chan, req);
+}
+EXPORT_SYMBOL_GPL(mbox_send_request);
+
+int mbox_wait_request(int err, struct mbox_wait *wait)
+{
+	switch (err) {
+	case -EINPROGRESS:
+		wait_for_completion(&wait->completion);
+		reinit_completion(&wait->completion);
+		err = wait->err;
+		break;
+	}
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(mbox_wait_request);
+
+void mbox_request_complete(struct mbox_request *req, int err)
+{
+	struct mbox_wait *wait;
+
+	if (err == -EINPROGRESS)
+		return;
+
+	wait = req->wait;
+	wait->err = err;
+	complete(&wait->completion);
+}
+EXPORT_SYMBOL_GPL(mbox_request_complete);
+
 /**
  * mbox_flush - flush a mailbox channel
  * @chan: mailbox channel to flush
@@ -311,24 +364,44 @@ int mbox_flush(struct mbox_chan *chan, unsigned long timeout)
 		return -ENOTSUPP;
 
 	ret = chan->mbox->ops->flush(chan, timeout);
-	if (ret < 0)
+	if (ret < 0 && !chan->mbox->ops->send_request)
 		tx_tick(chan, ret);
 
 	return ret;
 }
 EXPORT_SYMBOL_GPL(mbox_flush);
 
+static int mbox_chan_startup(struct mbox_chan *chan, struct mbox_client *cl)
+{
+	struct mbox_controller *mbox = chan->mbox;
+	struct device *dev = cl->dev;
+	int ret;
+
+	if (!mbox->ops->startup)
+		return 0;
+
+	ret = mbox->ops->startup(chan);
+	if (ret) {
+		dev_err(dev, "Unable to startup the chan (%d)\n", ret);
+		mbox_free_channel(chan);
+	}
+
+	return ret;
+}
+
 static int __mbox_bind_client(struct mbox_chan *chan, struct mbox_client *cl)
 {
 	struct device *dev = cl->dev;
 	unsigned long flags;
-	int ret;
 
 	if (chan->cl || !try_module_get(chan->mbox->dev->driver->owner)) {
 		dev_dbg(dev, "%s: mailbox not free\n", __func__);
 		return -EBUSY;
 	}
 
+	if (chan->mbox->ops->send_request)
+		return mbox_chan_startup(chan, cl);
+
 	spin_lock_irqsave(&chan->lock, flags);
 	chan->msg_free = 0;
 	chan->msg_count = 0;
@@ -341,17 +414,7 @@ static int __mbox_bind_client(struct mbox_chan *chan, struct mbox_client *cl)
 
 	spin_unlock_irqrestore(&chan->lock, flags);
 
-	if (chan->mbox->ops->startup) {
-		ret = chan->mbox->ops->startup(chan);
-
-		if (ret) {
-			dev_err(dev, "Unable to startup the chan (%d)\n", ret);
-			mbox_free_channel(chan);
-			return ret;
-		}
-	}
-
-	return 0;
+	return mbox_chan_startup(chan, cl);
 }
 
 /**
@@ -474,13 +537,18 @@ EXPORT_SYMBOL_GPL(mbox_request_channel_byname);
  */
 void mbox_free_channel(struct mbox_chan *chan)
 {
+	struct mbox_controller *mbox;
 	unsigned long flags;
 
 	if (!chan || !chan->cl)
 		return;
 
-	if (chan->mbox->ops->shutdown)
-		chan->mbox->ops->shutdown(chan);
+	mbox = chan->mbox;
+	if (mbox->ops->shutdown)
+		mbox->ops->shutdown(chan);
+
+	if (mbox->ops->send_request)
+		return module_put(mbox->dev->driver->owner);
 
 	/* The queued TX requests are simply aborted, no callbacks are made */
 	spin_lock_irqsave(&chan->lock, flags);
@@ -489,7 +557,7 @@ void mbox_free_channel(struct mbox_chan *chan)
 	if (chan->txdone_method == TXDONE_BY_ACK)
 		chan->txdone_method = TXDONE_BY_POLL;
 
-	module_put(chan->mbox->dev->driver->owner);
+	module_put(mbox->dev->driver->owner);
 	spin_unlock_irqrestore(&chan->lock, flags);
 }
 EXPORT_SYMBOL_GPL(mbox_free_channel);
@@ -506,6 +574,13 @@ of_mbox_index_xlate(struct mbox_controller *mbox,
 	return &mbox->chans[ind];
 }
 
+static void mbox_controller_add_tail(struct mbox_controller *mbox)
+{
+	mutex_lock(&con_mutex);
+	list_add_tail(&mbox->node, &mbox_cons);
+	mutex_unlock(&con_mutex);
+}
+
 /**
  * mbox_controller_register - Register the mailbox controller
  * @mbox:	Pointer to the mailbox controller.
@@ -520,6 +595,17 @@ int mbox_controller_register(struct mbox_controller *mbox)
 	if (!mbox || !mbox->dev || !mbox->ops || !mbox->num_chans)
 		return -EINVAL;
 
+	if (mbox->ops->send_request && mbox->ops->send_data)
+		return -EINVAL;
+
+	if (!mbox->of_xlate)
+		mbox->of_xlate = of_mbox_index_xlate;
+
+	if (mbox->ops->send_request) {
+		mbox_controller_add_tail(mbox);
+		return 0;
+	}
+
 	if (mbox->txdone_irq)
 		txdone = TXDONE_BY_IRQ;
 	else if (mbox->txdone_poll)
@@ -549,12 +635,7 @@ int mbox_controller_register(struct mbox_controller *mbox)
 		spin_lock_init(&chan->lock);
 	}
 
-	if (!mbox->of_xlate)
-		mbox->of_xlate = of_mbox_index_xlate;
-
-	mutex_lock(&con_mutex);
-	list_add_tail(&mbox->node, &mbox_cons);
-	mutex_unlock(&con_mutex);
+	mbox_controller_add_tail(mbox);
 
 	return 0;
 }
@@ -578,7 +659,7 @@ void mbox_controller_unregister(struct mbox_controller *mbox)
 	for (i = 0; i < mbox->num_chans; i++)
 		mbox_free_channel(&mbox->chans[i]);
 
-	if (mbox->txdone_poll)
+	if (!mbox->ops->send_request && mbox->txdone_poll)
 		hrtimer_cancel(&mbox->poll_hrt);
 
 	mutex_unlock(&con_mutex);
diff --git a/include/linux/mailbox_client.h b/include/linux/mailbox_client.h
index 734694912ef7..2eb951fdee0b 100644
--- a/include/linux/mailbox_client.h
+++ b/include/linux/mailbox_client.h
@@ -9,6 +9,7 @@
 
 #include <linux/of.h>
 #include <linux/device.h>
+#include <linux/mailbox_request.h>
 
 struct mbox_chan;
 
@@ -47,4 +48,7 @@ void mbox_client_txdone(struct mbox_chan *chan, int r); /* atomic */
 bool mbox_client_peek_data(struct mbox_chan *chan); /* atomic */
 void mbox_free_channel(struct mbox_chan *chan); /* may sleep */
 
+int mbox_send_request(struct mbox_chan *chan, struct mbox_request *req);
+int mbox_wait_request(int err, struct mbox_wait *wait);
+
 #endif /* __MAILBOX_CLIENT_H */
diff --git a/include/linux/mailbox_controller.h b/include/linux/mailbox_controller.h
index 6fee33cb52f5..0582964b10a0 100644
--- a/include/linux/mailbox_controller.h
+++ b/include/linux/mailbox_controller.h
@@ -8,6 +8,7 @@
 #include <linux/hrtimer.h>
 #include <linux/device.h>
 #include <linux/completion.h>
+#include <linux/mailbox_request.h>
 
 struct mbox_chan;
 
@@ -20,6 +21,10 @@ struct mbox_chan;
  *		transmission of data is reported by the controller via
  *		mbox_chan_txdone (if it has some TX ACK irq). It must not
  *		sleep.
+ * @send_request: The API asks the MBOX controller driver to transmit a message
+ *                on the bus. The call may be in atomic context. Returns
+ *                -EINPROGRESS if data is fed into hardware, -ENOSPC when the
+ *                hardware queue is full, or zero when the request completes.
  * @flush:	Called when a client requests transmissions to be blocking but
  *		the context doesn't allow sleeping. Typically the controller
  *		will implement a busy loop waiting for the data to flush out.
@@ -45,6 +50,7 @@ struct mbox_chan;
  */
 struct mbox_chan_ops {
 	int (*send_data)(struct mbox_chan *chan, void *data);
+	int (*send_request)(struct mbox_chan *chan, struct mbox_request *req);
 	int (*flush)(struct mbox_chan *chan, unsigned long timeout);
 	int (*startup)(struct mbox_chan *chan);
 	void (*shutdown)(struct mbox_chan *chan);
@@ -131,6 +137,7 @@ int mbox_controller_register(struct mbox_controller *mbox); /* can sleep */
 void mbox_controller_unregister(struct mbox_controller *mbox); /* can sleep */
 void mbox_chan_received_data(struct mbox_chan *chan, void *data); /* atomic */
 void mbox_chan_txdone(struct mbox_chan *chan, int r); /* atomic */
+void mbox_request_complete(struct mbox_request *req, int err); /* can sleep */
 
 int devm_mbox_controller_register(struct device *dev,
 				  struct mbox_controller *mbox);
diff --git a/include/linux/mailbox_request.h b/include/linux/mailbox_request.h
new file mode 100644
index 000000000000..69e897e7d3a4
--- /dev/null
+++ b/include/linux/mailbox_request.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright 2024 Linaro Ltd.
+ */
+
+#ifndef __MAILBOX_REQUEST_H
+#define __MAILBOX_REQUEST_H
+
+#include <linux/bits.h>
+#include <linux/types.h>
+#include <linux/completion.h>
+
+struct mbox_wait {
+	struct completion completion;
+	int err;
+};
+
+#define DECLARE_MBOX_WAIT(_wait) \
+	struct mbox_wait _wait = { \
+		COMPLETION_INITIALIZER_ONSTACK((_wait).completion), 0 }
+
+#define MBOX_REQ_MAY_SLEEP	BIT(0)
+
+struct mbox_request {
+	struct mbox_wait *wait;
+	void *tx;
+	void *rx;
+	unsigned int txlen;
+	unsigned int rxlen;
+	u32 flags;
+};
+
+#endif /* __MAILBOX_REQUEST_H */
-- 
2.47.0.rc1.288.g06298d1525-goog




* [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-17 16:36 [PATCH v2 0/2] mailbox: add async request mechanism w/ a user Tudor Ambarus
  2024-10-17 16:36 ` [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues Tudor Ambarus
@ 2024-10-17 16:36 ` Tudor Ambarus
  2024-10-21 11:52   ` Krzysztof Kozlowski
                     ` (2 more replies)
  1 sibling, 3 replies; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-17 16:36 UTC (permalink / raw)
  To: jassisinghbrar, krzk
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano, Tudor Ambarus

ACPM (Alive Clock and Power Manager) is firmware that runs on the APM
(Active Power Management) module and handles the overall power management
activities. ACPM and the masters regard each other as independent
hardware components and communicate with each other using mailbox
messages and shared memory.

The mailbox channels are initialized based on the configuration data
found at a specific offset into the shared memory (mmio-sram). The
configuration data consists of the channel ID, the message and queue
lengths, pointers to the RX and TX queues (which are also part of the
SRAM), and whether RX works by polling or by interrupts. All known
clients of this driver use polling channels, so for now the driver
implements only polling mode.

Add support for the exynos acpm core driver. Helper drivers will follow.
These will construct the mailbox messages in the format expected by the
firmware.
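
As a rough illustration of how such a helper could use the new request API
(the function name and command layout below are made up, they are not part
of this series):

static int acpm_helper_xfer(struct mbox_chan *chan, u32 *cmd, size_t len)
{
	/* len is the number of u32 words in cmd; req lengths are in bytes. */
	struct exynos_acpm_message msg = { .cmd = cmd, .len = len };
	DECLARE_MBOX_WAIT(wait);
	struct mbox_request req = {
		.wait	= &wait,
		.tx	= &msg,
		.rx	= &msg,		/* response copied back in place */
		.txlen	= len * sizeof(*cmd),
		.rxlen	= len * sizeof(*cmd),
		.flags	= MBOX_REQ_MAY_SLEEP,
	};

	return mbox_wait_request(mbox_send_request(chan, &req), &wait);
}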

Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
 drivers/firmware/Kconfig                    |   1 +
 drivers/firmware/Makefile                   |   1 +
 drivers/firmware/samsung/Kconfig            |  11 +
 drivers/firmware/samsung/Makefile           |   3 +
 drivers/firmware/samsung/exynos-acpm.c      | 703 ++++++++++++++++++++
 include/linux/mailbox/exynos-acpm-message.h |  21 +
 6 files changed, 740 insertions(+)
 create mode 100644 drivers/firmware/samsung/Kconfig
 create mode 100644 drivers/firmware/samsung/Makefile
 create mode 100644 drivers/firmware/samsung/exynos-acpm.c
 create mode 100644 include/linux/mailbox/exynos-acpm-message.h

diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 71d8b26c4103..24edb956831b 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -267,6 +267,7 @@ source "drivers/firmware/meson/Kconfig"
 source "drivers/firmware/microchip/Kconfig"
 source "drivers/firmware/psci/Kconfig"
 source "drivers/firmware/qcom/Kconfig"
+source "drivers/firmware/samsung/Kconfig"
 source "drivers/firmware/smccc/Kconfig"
 source "drivers/firmware/tegra/Kconfig"
 source "drivers/firmware/xilinx/Kconfig"
diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
index 7a8d486e718f..91efcc868a05 100644
--- a/drivers/firmware/Makefile
+++ b/drivers/firmware/Makefile
@@ -33,6 +33,7 @@ obj-y				+= efi/
 obj-y				+= imx/
 obj-y				+= psci/
 obj-y				+= qcom/
+obj-y				+= samsung/
 obj-y				+= smccc/
 obj-y				+= tegra/
 obj-y				+= xilinx/
diff --git a/drivers/firmware/samsung/Kconfig b/drivers/firmware/samsung/Kconfig
new file mode 100644
index 000000000000..f908773c1441
--- /dev/null
+++ b/drivers/firmware/samsung/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config EXYNOS_ACPM
+	tristate "Exynos ACPM (Alive Clock and Power Manager) driver support"
+	select MAILBOX
+	help
+	  ACPM is a firmware that operates on the APM (Active Power Management)
+	  module that handles overall power management activities. ACPM and
+	  masters regard each other as independent hardware component and
+	  communicate with each other using mailbox messages and shared memory.
+	  This module provides the means to communicate with the ACPM firmware.
diff --git a/drivers/firmware/samsung/Makefile b/drivers/firmware/samsung/Makefile
new file mode 100644
index 000000000000..35ff3076bbea
--- /dev/null
+++ b/drivers/firmware/samsung/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_EXYNOS_ACPM)	+= exynos-acpm.o
diff --git a/drivers/firmware/samsung/exynos-acpm.c b/drivers/firmware/samsung/exynos-acpm.c
new file mode 100644
index 000000000000..c3ad4dc7a9e0
--- /dev/null
+++ b/drivers/firmware/samsung/exynos-acpm.c
@@ -0,0 +1,703 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright 2020 Samsung Electronics Co., Ltd.
+ * Copyright 2020 Google LLC.
+ * Copyright 2024 Linaro Ltd.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bitmap.h>
+#include <linux/bits.h>
+#include <linux/container_of.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/mailbox_controller.h>
+#include <linux/mailbox/exynos-acpm-message.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#define EXYNOS_ACPM_MCUCTRL		0x0	/* Mailbox Control Register */
+#define EXYNOS_ACPM_INTCR0		0x24	/* Interrupt Clear Register 0 */
+#define EXYNOS_ACPM_INTMR0		0x28	/* Interrupt Mask Register 0 */
+#define EXYNOS_ACPM_INTSR0		0x2c	/* Interrupt Status Register 0 */
+#define EXYNOS_ACPM_INTMSR0		0x30	/* Interrupt Mask Status Register 0 */
+#define EXYNOS_ACPM_INTGR1		0x40	/* Interrupt Generation Register 1 */
+#define EXYNOS_ACPM_INTMR1		0x48	/* Interrupt Mask Register 1 */
+#define EXYNOS_ACPM_INTSR1		0x4c	/* Interrupt Status Register 1 */
+#define EXYNOS_ACPM_INTMSR1		0x50	/* Interrupt Mask Status Register 1 */
+
+#define EXYNOS_ACPM_INTMR0_MASK		GENMASK(15, 0)
+#define EXYNOS_ACPM_PROTOCOL_SEQNUM	GENMASK(21, 16)
+
+/* The unit of counter is 20 us. 5000 * 20 = 100 ms */
+#define EXYNOS_ACPM_POLL_TIMEOUT	5000
+#define EXYNOS_ACPM_TX_TIMEOUT_US	500000
+
+/**
+ * struct exynos_acpm_shmem - mailbox shared memory configuration information.
+ * @reserved:	reserved for future use.
+ * @chans:	offset to array of struct exynos_acpm_shmem_chan.
+ * @reserved1:	reserved for future use.
+ * @num_chans:	number of channels.
+ */
+struct exynos_acpm_shmem {
+	u32 reserved[2];
+	u32 chans;
+	u32 reserved1[3];
+	u32 num_chans;
+};
+
+/**
+ * struct exynos_acpm_shmem_chan - descriptor of a shared memory channel.
+ *
+ * @id:			channel ID.
+ * @reserved:		reserved for future use.
+ * @rx_rear:		rear pointer of RX queue.
+ * @rx_front:		front pointer of RX queue.
+ * @rx_base:		base address of RX queue.
+ * @reserved1:		reserved for future use.
+ * @tx_rear:		rear pointer of TX queue.
+ * @tx_front:		front pointer of TX queue.
+ * @tx_base:		base address of TX queue.
+ * @qlen:		queue length. Applies to both TX/RX queues.
+ * @mlen:		message length. Applies to both TX/RX queues.
+ * @reserved2:		reserved for future use.
+ * @polling:		true when the channel works on polling.
+ */
+struct exynos_acpm_shmem_chan {
+	u32 id;
+	u32 reserved[3];
+	u32 rx_rear;
+	u32 rx_front;
+	u32 rx_base;
+	u32 reserved1[3];
+	u32 tx_rear;
+	u32 tx_front;
+	u32 tx_base;
+	u32 qlen;
+	u32 mlen;
+	u32 reserved2[2];
+	u32 polling;
+};
+
+/**
+ * struct exynos_acpm_queue - exynos acpm queue.
+ *
+ * @rear:	rear address of the queue.
+ * @front:	front address of the queue.
+ * @base:	base address of the queue.
+ */
+struct exynos_acpm_queue {
+	void __iomem *rear;
+	void __iomem *front;
+	void __iomem *base;
+};
+
+/**
+ * struct exynos_acpm_rx_data - RX queue data.
+ *
+ * @cmd:	pointer to where the data shall be saved.
+ * @response:	true if the client expects the RX data.
+ */
+struct exynos_acpm_rx_data {
+	u32 *cmd;
+	bool response;
+};
+
+#define EXYNOS_ACPM_SEQNUM_MAX    64
+
+/**
+ * struct exynos_acpm_chan - driver internal representation of a channel.
+ * @tx:		TX queue. The enqueue is done by the host.
+ *			- front index is written by the host.
+ *			- rear index is written by the firmware.
+ *
+ * @rx:		RX queue. The enqueue is done by the firmware.
+ *			- front index is written by the firmware.
+ *			- rear index is written by the host.
+ * @rx_lock:	protects RX queue. The RX queue is accessed just in
+ *		process context.
+ * @tx_lock:	protects TX queue.
+ * @qlen:	queue length. Applies to both TX/RX queues.
+ * @mlen:	message length. Applies to both TX/RX queues.
+ * @seqnum:	sequence number of the last message enqueued on TX queue.
+ * @id:		channel ID.
+ * @polling:	true when the channel works on polling.
+ * @bitmap_seqnum: bitmap that tracks the messages on the TX/RX queues.
+ * @rx_data:	internal buffer used to drain the RX queue.
+ */
+struct exynos_acpm_chan {
+	struct exynos_acpm_queue tx;
+	struct exynos_acpm_queue rx;
+	struct mutex rx_lock;
+	spinlock_t tx_lock;
+
+	unsigned int qlen;
+	unsigned int mlen;
+	u8 seqnum;
+	u8 id;
+	bool polling;
+
+	DECLARE_BITMAP(bitmap_seqnum, EXYNOS_ACPM_SEQNUM_MAX - 1);
+	struct exynos_acpm_rx_data rx_data[EXYNOS_ACPM_SEQNUM_MAX];
+};
+
+/**
+ * struct exynos_acpm - driver's private data.
+ * @shmem:	pointer to the SRAM configuration data.
+ * @chans:	pointer to the ACPM channel parameters retrieved from SRAM.
+ * @sram_base:	base address of SRAM.
+ * @regs:	mailbox registers base address.
+ * @mbox:	pointer to the mailbox controller.
+ * @wq:		pointer to workqueue.
+ * @dev:	pointer to the exynos-acpm device.
+ * @pclk:	pointer to the mailbox peripheral clock.
+ * @num_chans:	number of channels available for this controller.
+ */
+struct exynos_acpm {
+	struct exynos_acpm_shmem *shmem;
+	struct exynos_acpm_chan *chans;
+	void __iomem *sram_base;
+	void __iomem *regs;
+	struct mbox_controller *mbox;
+	struct workqueue_struct *wq;
+	struct device *dev;
+	struct clk *pclk;
+	u32 num_chans;
+};
+
+/**
+ * struct exynos_acpm_work_data - data structure representing the work.
+ * @mbox_chan:	pointer to the mailbox channel.
+ * @req:	pointer to the mailbox request.
+ * @callback:	pointer to a callback function to be invoked upon
+ *		completion of this request.
+ * @work:	describes the task to be executed.
+ */
+struct exynos_acpm_work_data {
+	struct mbox_chan *mbox_chan;
+	struct mbox_request *req;
+	void (*callback)(struct exynos_acpm_work_data *work_data, int status);
+	struct work_struct work;
+};
+
+static int exynos_acpm_get_rx(struct mbox_chan *mbox_chan,
+			      struct mbox_request *req)
+{
+	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
+	struct exynos_acpm_message *tx = req->tx;
+	struct exynos_acpm_message *rx = req->rx;
+	struct exynos_acpm_rx_data *rx_data;
+	const void __iomem *base, *addr;
+	u32 rx_front, rx_seqnum, tx_seqnum, seqnum;
+	u32 i, val, mlen;
+	bool rx_set = false;
+
+	rx_front = readl_relaxed(chan->rx.front);
+	i = readl_relaxed(chan->rx.rear);
+
+	/* Bail out if RX is empty. */
+	if (i == rx_front)
+		return 0;
+
+	base = chan->rx.base;
+	mlen = chan->mlen;
+
+	tx_seqnum = FIELD_GET(EXYNOS_ACPM_PROTOCOL_SEQNUM, tx->cmd[0]);
+
+	/* Drain RX queue. */
+	do {
+		/* Read RX seqnum. */
+		addr = base + mlen * i;
+		val = readl_relaxed(addr);
+
+		rx_seqnum = FIELD_GET(EXYNOS_ACPM_PROTOCOL_SEQNUM, val);
+		if (!rx_seqnum)
+			return -EIO;
+		/*
+		 * mssg seqnum starts with value 1, whereas the driver considers
+		 * the first mssg at index 0.
+		 */
+		seqnum = rx_seqnum - 1;
+		rx_data = &chan->rx_data[seqnum];
+
+		if (rx_data->response) {
+			if (rx_seqnum == tx_seqnum) {
+				__ioread32_copy(rx->cmd, addr, req->rxlen / 4);
+				rx_set = true;
+				clear_bit(seqnum, chan->bitmap_seqnum);
+			} else {
+				/*
+				 * The RX data corresponds to another request.
+				 * Save the data to drain the queue, but don't
+				 * clear yet the bitmap. It will be cleared
+				 * after the response is copied to the request.
+				 */
+				__ioread32_copy(rx_data->cmd, addr,
+						req->rxlen / 4);
+			}
+		} else {
+			clear_bit(seqnum, chan->bitmap_seqnum);
+		}
+
+		i = (i + 1) % chan->qlen;
+	} while (i != rx_front);
+
+	/* We saved all responses, mark RX empty. */
+	writel_relaxed(rx_front, chan->rx.rear);
+
+	/* Flush SRAM posted writes. */
+	readl_relaxed(chan->rx.front);
+
+	/*
+	 * If the response was not in this iteration of the queue, check if the
+	 * RX data was previously saved.
+	 */
+	rx_data = &chan->rx_data[tx_seqnum - 1];
+	if (!rx_set && rx_data->response) {
+		rx_seqnum = FIELD_GET(EXYNOS_ACPM_PROTOCOL_SEQNUM,
+				      rx_data->cmd[0]);
+
+		if (rx_seqnum == tx_seqnum) {
+			memcpy(rx->cmd, rx_data->cmd, req->rxlen);
+			clear_bit(rx_seqnum - 1, chan->bitmap_seqnum);
+		}
+	}
+
+	return 0;
+}
+
+static int exynos_acpm_dequeue_by_polling(struct mbox_chan *mbox_chan,
+					  struct mbox_request *req)
+{
+	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
+	struct exynos_acpm_message *tx = req->tx;
+	struct device *dev = mbox_chan->mbox->dev;
+	unsigned int cnt_20us = 0;
+	u32 seqnum;
+	int ret;
+
+	seqnum = FIELD_GET(EXYNOS_ACPM_PROTOCOL_SEQNUM, tx->cmd[0]);
+
+	do {
+		ret = mutex_lock_interruptible(&chan->rx_lock);
+		if (ret)
+			return ret;
+		ret = exynos_acpm_get_rx(mbox_chan, req);
+		mutex_unlock(&chan->rx_lock);
+		if (ret)
+			return ret;
+
+		if (!test_bit(seqnum - 1, chan->bitmap_seqnum)) {
+			dev_vdbg(dev, "cnt_20us = %d.\n", cnt_20us);
+			return 0;
+		}
+
+		/* Determined experimentally. */
+		usleep_range(20, 30);
+		cnt_20us++;
+	} while (cnt_20us < EXYNOS_ACPM_POLL_TIMEOUT);
+
+	dev_err(dev, "Timeout! ch:%u s:%u bitmap:%lx, cnt_20us = %d.\n",
+		chan->id, seqnum, chan->bitmap_seqnum[0], cnt_20us);
+
+	return -ETIME;
+}
+
+static void exynos_acpm_done(struct exynos_acpm_work_data *work_data, int status)
+{
+	struct mbox_request *req = work_data->req;
+
+	kfree(work_data);
+	mbox_request_complete(req, status);
+}
+
+static void exynos_acpm_work_handler(struct work_struct *work)
+{
+	struct exynos_acpm_work_data *work_data =
+		container_of(work, struct exynos_acpm_work_data, work);
+	struct mbox_chan *mbox_chan = work_data->mbox_chan;
+	int ret;
+
+	ret = exynos_acpm_dequeue_by_polling(mbox_chan, work_data->req);
+	work_data->callback(work_data, ret);
+}
+
+static struct exynos_acpm_work_data *
+	exynos_acpm_init_work(struct mbox_chan *mbox_chan,
+			      struct mbox_request *req)
+{
+	struct exynos_acpm_work_data *work_data;
+	gfp_t gfp = (req->flags & MBOX_REQ_MAY_SLEEP) ? GFP_KERNEL : GFP_ATOMIC;
+
+	work_data = kmalloc(sizeof(*work_data), gfp);
+	if (!work_data)
+		return ERR_PTR(-ENOMEM);
+
+	work_data->mbox_chan = mbox_chan;
+	work_data->req = req;
+	work_data->callback = exynos_acpm_done;
+	INIT_WORK(&work_data->work, exynos_acpm_work_handler);
+
+	return work_data;
+}
+
+static void exynos_acpm_prepare_request(struct mbox_chan *mbox_chan,
+					struct mbox_request *req)
+{
+	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
+	struct exynos_acpm_message *tx = req->tx;
+	struct exynos_acpm_rx_data *rx_data;
+
+	/* Prevent chan->seqnum from being re-used */
+	do {
+		if (++chan->seqnum == EXYNOS_ACPM_SEQNUM_MAX)
+			chan->seqnum = 1;
+	} while (test_bit(chan->seqnum - 1, chan->bitmap_seqnum));
+
+	tx->cmd[0] |= FIELD_PREP(EXYNOS_ACPM_PROTOCOL_SEQNUM, chan->seqnum);
+
+	/* Clear data for upcoming responses */
+	rx_data = &chan->rx_data[chan->seqnum - 1];
+	memset(rx_data->cmd, 0, sizeof(*(rx_data->cmd)) * chan->mlen);
+	if (req->rx)
+		rx_data->response = true;
+
+	/* Flag the index based on seqnum. (seqnum: 1~63, bitmap: 0~62) */
+	set_bit(chan->seqnum - 1, chan->bitmap_seqnum);
+}
+
+static int exynos_acpm_wait_for_queue_slots(struct mbox_chan *mbox_chan,
+					    u32 next_tx_front)
+{
+	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
+	struct device *dev = mbox_chan->mbox->dev;
+	u32 val, ret;
+
+	/*
+	 * Wait for RX front to keep up with TX front. Make sure there's at
+	 * least one element between them.
+	 */
+	ret = readl_relaxed_poll_timeout_atomic(chan->rx.front, val,
+						next_tx_front != val, 1,
+						EXYNOS_ACPM_TX_TIMEOUT_US);
+	if (ret) {
+		dev_err(dev, "RX front can not keep up with TX front.\n");
+		return ret;
+	}
+
+	ret = readl_relaxed_poll_timeout_atomic(chan->tx.rear, val,
+						next_tx_front != val, 1,
+						EXYNOS_ACPM_TX_TIMEOUT_US);
+	if (ret)
+		dev_err(dev, "TX queue is full.\n");
+
+	return ret;
+}
+
+static int exynos_acpm_send_request(struct mbox_chan *mbox_chan,
+				    struct mbox_request *req)
+{
+	struct exynos_acpm *exynos_acpm = dev_get_drvdata(mbox_chan->mbox->dev);
+	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
+	struct exynos_acpm_message *tx = req->tx;
+	struct exynos_acpm_work_data *work_data;
+	u32 idx, tx_front;
+	unsigned long flags;
+	int ret;
+
+	if (!tx || !tx->cmd || req->txlen > chan->mlen ||
+	    req->rxlen > chan->mlen)
+		return -EINVAL;
+
+	work_data = exynos_acpm_init_work(mbox_chan, req);
+	if (IS_ERR(work_data))
+		return PTR_ERR(work_data);
+
+	spin_lock_irqsave(&chan->tx_lock, flags);
+
+	tx_front = readl_relaxed(chan->tx.front);
+	idx = (tx_front + 1) % chan->qlen;
+
+	ret = exynos_acpm_wait_for_queue_slots(mbox_chan, idx);
+	if (ret)
+		goto exit;
+
+	exynos_acpm_prepare_request(mbox_chan, req);
+
+	/* Write TX command. */
+	__iowrite32_copy(chan->tx.base + chan->mlen * tx_front, tx->cmd,
+			 req->txlen / 4);
+
+	/* Advance TX front. */
+	writel_relaxed(idx, chan->tx.front);
+
+	/* Flush SRAM posted writes. */
+	readl_relaxed(chan->tx.front);
+
+	/* Generate ACPM interrupt. */
+	writel_relaxed(BIT(chan->id), exynos_acpm->regs + EXYNOS_ACPM_INTGR1);
+
+	/* Flush mailbox controller posted writes. */
+	readl_relaxed(exynos_acpm->regs + EXYNOS_ACPM_MCUCTRL);
+
+	spin_unlock_irqrestore(&chan->tx_lock, flags);
+
+	queue_work(exynos_acpm->wq, &work_data->work);
+
+	return -EINPROGRESS;
+exit:
+	spin_unlock_irqrestore(&chan->tx_lock, flags);
+	kfree(work_data);
+	return ret;
+}
+
+static int exynos_acpm_chan_startup(struct mbox_chan *mbox_chan)
+{
+	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
+
+	if (!chan->polling) {
+		dev_err(mbox_chan->mbox->dev, "IRQs not supported.\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static const struct mbox_chan_ops exynos_acpm_chan_ops = {
+	.send_request = exynos_acpm_send_request,
+	.startup = exynos_acpm_chan_startup,
+};
+
+static void __iomem *exynos_acpm_get_iomem_addr(void __iomem *base,
+						void __iomem *addr)
+{
+	u32 offset;
+
+	offset = readl_relaxed(addr);
+	return base + offset;
+}
+
+static void exynos_acpm_rx_queue_init(struct exynos_acpm *exynos_acpm,
+				      struct exynos_acpm_shmem_chan *shmem_chan,
+				      struct exynos_acpm_queue *rx)
+{
+	void __iomem *base = exynos_acpm->sram_base;
+
+	rx->base = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_base);
+	rx->front = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_front);
+	rx->rear = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_rear);
+}
+
+static void exynos_acpm_tx_queue_init(struct exynos_acpm *exynos_acpm,
+				      struct exynos_acpm_shmem_chan *shmem_chan,
+				      struct exynos_acpm_queue *tx)
+{
+	void __iomem *base = exynos_acpm->sram_base;
+
+	tx->base = exynos_acpm_get_iomem_addr(base, &shmem_chan->rx_base);
+	tx->front = exynos_acpm_get_iomem_addr(base, &shmem_chan->rx_front);
+	tx->rear = exynos_acpm_get_iomem_addr(base, &shmem_chan->rx_rear);
+}
+
+static int exynos_acpm_alloc_cmds(struct exynos_acpm *exynos_acpm,
+				  struct exynos_acpm_chan *chan)
+{
+	struct device *dev = exynos_acpm->dev;
+	struct exynos_acpm_rx_data *rx_data;
+	unsigned int mlen = chan->mlen;
+	int i;
+
+	for (i = 0; i < EXYNOS_ACPM_SEQNUM_MAX; i++) {
+		rx_data = &chan->rx_data[i];
+		rx_data->cmd = devm_kcalloc(dev, mlen, sizeof(*(rx_data->cmd)),
+					    GFP_KERNEL);
+		if (!rx_data->cmd)
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int exynos_acpm_chans_init(struct exynos_acpm *exynos_acpm)
+{
+	struct exynos_acpm_shmem_chan *shmem_chans, *shmem_chan;
+	struct exynos_acpm_shmem *shmem = exynos_acpm->shmem;
+	struct mbox_chan *mbox_chan, *mbox_chans;
+	struct exynos_acpm_chan *chan, *chans;
+	struct device *dev = exynos_acpm->dev;
+	int i, ret;
+
+	exynos_acpm->num_chans = readl_relaxed(&shmem->num_chans);
+
+	mbox_chans = devm_kcalloc(dev, exynos_acpm->num_chans,
+				  sizeof(*mbox_chans), GFP_KERNEL);
+	if (!mbox_chans)
+		return -ENOMEM;
+
+	chans = devm_kcalloc(dev, exynos_acpm->num_chans, sizeof(*chans),
+			     GFP_KERNEL);
+	if (!chans)
+		return -ENOMEM;
+
+	shmem_chans = exynos_acpm_get_iomem_addr(exynos_acpm->sram_base,
+						 &shmem->chans);
+
+	for (i = 0; i < exynos_acpm->num_chans; i++) {
+		shmem_chan = &shmem_chans[i];
+		mbox_chan = &mbox_chans[i];
+		chan = &chans[i];
+
+		/* Set parameters for the mailbox channel. */
+		mbox_chan->con_priv = chan;
+		mbox_chan->mbox = exynos_acpm->mbox;
+
+		/* Set parameters for the ACPM channel. */
+		chan->mlen = readl_relaxed(&shmem_chan->mlen);
+
+		ret = exynos_acpm_alloc_cmds(exynos_acpm, chan);
+		if (ret)
+			return ret;
+
+		chan->polling = readl_relaxed(&shmem_chan->polling);
+		chan->id = readl_relaxed(&shmem_chan->id);
+		chan->qlen = readl_relaxed(&shmem_chan->qlen);
+
+		exynos_acpm_rx_queue_init(exynos_acpm, shmem_chan, &chan->rx);
+		exynos_acpm_tx_queue_init(exynos_acpm, shmem_chan, &chan->tx);
+
+		mutex_init(&chan->rx_lock);
+		spin_lock_init(&chan->tx_lock);
+
+		dev_vdbg(dev, "ID = %d poll = %d, mlen = %d, qlen = %d\n",
+			 chan->id, chan->polling, chan->mlen, chan->qlen);
+	}
+
+	/* Save pointers to the ACPM and mailbox channels. */
+	exynos_acpm->mbox->chans = mbox_chans;
+	exynos_acpm->chans = chans;
+
+	return 0;
+}
+
+static const struct of_device_id exynos_acpm_match[] = {
+	{ .compatible = "google,gs101-acpm" },
+	{},
+};
+MODULE_DEVICE_TABLE(of, exynos_acpm_match);
+
+static int exynos_acpm_probe(struct platform_device *pdev)
+{
+	struct device_node *node = pdev->dev.of_node;
+	struct device *dev = &pdev->dev;
+	struct exynos_acpm *exynos_acpm;
+	struct mbox_controller *mbox;
+	struct device_node *shmem;
+	resource_size_t size;
+	struct resource res;
+	const __be32 *prop;
+	int ret;
+
+	exynos_acpm = devm_kzalloc(dev, sizeof(*exynos_acpm), GFP_KERNEL);
+	if (!exynos_acpm)
+		return -ENOMEM;
+
+	mbox = devm_kzalloc(dev, sizeof(*mbox), GFP_KERNEL);
+	if (!mbox)
+		return -ENOMEM;
+
+	exynos_acpm->regs = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(exynos_acpm->regs))
+		return PTR_ERR(exynos_acpm->regs);
+
+	shmem = of_parse_phandle(node, "shmem", 0);
+	ret = of_address_to_resource(shmem, 0, &res);
+	of_node_put(shmem);
+	if (ret) {
+		dev_err(dev, "Failed to get shared memory.\n");
+		return ret;
+	}
+
+	size = resource_size(&res);
+	exynos_acpm->sram_base = devm_ioremap(dev, res.start, size);
+	if (!exynos_acpm->sram_base) {
+		dev_err(dev, "Failed to ioremap shared memory.\n");
+		return -ENOMEM;
+	}
+
+	prop = of_get_property(node, "initdata-base", NULL);
+	if (!prop) {
+		dev_err(dev, "Parsing initdata_base failed.\n");
+		return -EINVAL;
+	}
+
+	exynos_acpm->pclk = devm_clk_get(dev, "pclk");
+	if (IS_ERR(exynos_acpm->pclk)) {
+		dev_err(dev, "Missing peripheral clock.\n");
+		return PTR_ERR(exynos_acpm->pclk);
+	}
+
+	ret = clk_prepare_enable(exynos_acpm->pclk);
+	if (ret) {
+		dev_err(dev, "Failed to enable the peripheral clock.\n");
+		return ret;
+	}
+
+	exynos_acpm->wq = alloc_workqueue("exynos-acpm-wq", 0, 0);
+	if (!exynos_acpm->wq)
+		return -ENOMEM;
+
+	exynos_acpm->dev = dev;
+	exynos_acpm->mbox = mbox;
+	exynos_acpm->shmem = exynos_acpm->sram_base + be32_to_cpup(prop);
+
+	ret = exynos_acpm_chans_init(exynos_acpm);
+	if (ret)
+		return ret;
+
+	mbox->num_chans = exynos_acpm->num_chans;
+	mbox->dev = dev;
+	mbox->ops = &exynos_acpm_chan_ops;
+
+	platform_set_drvdata(pdev, exynos_acpm);
+
+	/* Mask out all interrupts. We support just polling channels for now. */
+	writel_relaxed(EXYNOS_ACPM_INTMR0_MASK,
+		       exynos_acpm->regs + EXYNOS_ACPM_INTMR0);
+
+	ret = devm_mbox_controller_register(dev, mbox);
+	if (ret)
+		dev_err(dev, "Failed to register mbox_controller(%d).\n", ret);
+
+	return ret;
+}
+
+static void exynos_acpm_remove(struct platform_device *pdev)
+{
+	struct exynos_acpm *exynos_acpm = platform_get_drvdata(pdev);
+
+	flush_workqueue(exynos_acpm->wq);
+	destroy_workqueue(exynos_acpm->wq);
+	clk_disable_unprepare(exynos_acpm->pclk);
+}
+
+static struct platform_driver exynos_acpm_driver = {
+	.probe	= exynos_acpm_probe,
+	.remove = exynos_acpm_remove,
+	.driver	= {
+		.name = "exynos-acpm",
+		.owner	= THIS_MODULE,
+		.of_match_table	= exynos_acpm_match,
+	},
+};
+module_platform_driver(exynos_acpm_driver);
+
+MODULE_AUTHOR("Tudor Ambarus <tudor.ambarus@linaro.org>");
+MODULE_DESCRIPTION("EXYNOS ACPM mailbox driver");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/mailbox/exynos-acpm-message.h b/include/linux/mailbox/exynos-acpm-message.h
new file mode 100644
index 000000000000..3799868c40b8
--- /dev/null
+++ b/include/linux/mailbox/exynos-acpm-message.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright 2024 Linaro Ltd.
+ */
+
+#ifndef _LINUX_EXYNOS_ACPM_MESSAGE_H_
+#define _LINUX_EXYNOS_ACPM_MESSAGE_H_
+
+#include <linux/types.h>
+
+/**
+ * struct exynos_acpm_message - exynos ACPM mailbox message format.
+ * @cmd: pointer to u32 command.
+ * @len: length of the command.
+ */
+struct exynos_acpm_message {
+	u32 *cmd;
+	size_t len;
+};
+
+#endif /* _LINUX_EXYNOS_ACPM_MESSAGE_H_ */
-- 
2.47.0.rc1.288.g06298d1525-goog




* Re: [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-17 16:36 ` [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues Tudor Ambarus
@ 2024-10-18  4:17   ` Jassi Brar
  2024-10-18  7:49     ` Tudor Ambarus
  0 siblings, 1 reply; 25+ messages in thread
From: Jassi Brar @ 2024-10-18  4:17 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: krzk, alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

On Thu, Oct 17, 2024 at 11:36 AM Tudor Ambarus <tudor.ambarus@linaro.org> wrote:
>
> The current form of the mailbox framework doesn't allow controllers to
> benefit from their hardware queue capabilities, as the framework handles
> a single active request at a time.
>
> The active request is considered completed when TX completes. But it seems
> that TX is not in direct relation with RX,
>
Correct, and it is not meant to be.
You are assuming there is always an RX in response to a TX, which is
not the case. Many platforms just send a message and only need to know
when it is sent. Many platforms only listen for incoming messages.
Many platforms have TX and RX but not as parts of one exchange. In
fact, only minority of platforms expect RX after each TX. Btw, what if
some platform sends only and always after each receive? For these
reasons, it is left to the user to tie an incoming RX to some previous
TX, or not.

Regards.
Jassi



* Re: [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-18  4:17   ` Jassi Brar
@ 2024-10-18  7:49     ` Tudor Ambarus
  2024-10-21  6:18       ` Tudor Ambarus
  0 siblings, 1 reply; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-18  7:49 UTC (permalink / raw)
  To: Jassi Brar
  Cc: krzk, alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

Hi, Jassi,

Thanks for the review!

On 10/18/24 5:17 AM, Jassi Brar wrote:
> On Thu, Oct 17, 2024 at 11:36 AM Tudor Ambarus <tudor.ambarus@linaro.org> wrote:
>>
>> The current form of the mailbox framework doesn't allow controllers to
>> benefit from their hardware queue capabilities, as the framework handles
>> a single active request at a time.
>>
>> The active request is considered completed when TX completes. But it seems
>> that TX is not in direct relation with RX,
>>
> Correct, and it is not meant to be.
> You are assuming there is always an RX in response to a TX, which is

Not really. If there's no response expected, clients can set req->rx to
NULL. Then the controllers know that no response is expected and can
complete the request when TX completes.

> not the case. Many platforms just send a message and only need to know
> when it is sent. Many platforms only listen for incoming messages.

these 2 cases are covered with the req approach.

> Many platforms have TX and RX but not as parts of one exchange. In

I don't think I understand this case. Is it related to what you describe
below?

> fact, only minority of platforms expect RX after each TX. Btw, what if

Right, I noticed.

> some platform sends only and always after each receive? For these

This case is covered as well with the req approach. One just needs to
serialize the requests:

ret = mbox_send_request(dc->mbox_chan, req1);
ret = mbox_wait_request(ret, req1->wait);
if (ret)
	return ret;

// req1 completed, send req2
ret = mbox_send_request(dc->mbox_chan, req2);
ret = mbox_wait_request(ret, req2->wait);
if (ret)
	return ret;
	
This shall work regardless of whether the client expects a response or not. If
no response is expected, but just a TX completion, then the client can
set req->rx = NULL.

> reasons, it is left to the user to tie an incoming RX to some previous
> TX, or not.

It's possible I haven't covered all the cases, but I'm willing to
understand them and come up with a new version with your help, where I
address all the concerns.

Cheers,
ta



* Re: [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-18  7:49     ` Tudor Ambarus
@ 2024-10-21  6:18       ` Tudor Ambarus
  2024-10-21 16:32         ` Jassi Brar
  0 siblings, 1 reply; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-21  6:18 UTC (permalink / raw)
  To: Jassi Brar
  Cc: krzk, alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

Hi, Jassi,

On 10/18/24 8:49 AM, Tudor Ambarus wrote:
>>> The active request is considered completed when TX completes. But it seems
>>> that TX is not in direct relation with RX,
>>>
>> Correct, and it is not meant to be.
>> You are assuming there is always an RX in response to a TX, which is
> Not really. If there's no response expected, clients can set req->rx to
> NULL. Then the controllers know that no response is expected and can
> complete the request when TX completes.
> 
>> not the case. Many platforms just send a message and only need to know
>> when it is sent. Many platforms only listen for incoming messages.
> these 2 cases are covered with the req approach.
> 
>> Many platforms have TX and RX but not as parts of one exchange. In
> I don't think I understand this case. Is it related to what you describe
> below?
> 
>> fact, only minority of platforms expect RX after each TX. Btw, what if
> Right, I noticed.
> 
>> some platform sends only and always after each receive? For these
> This case is covered as well with the req approach. One just needs to
> serialize the requests:
> 
> ret = mbox_send_request(dc->mbox_chan, req1);
> ret = mbox_wait_request(ret, req1->wait);
> if (ret)
> 	return ret;
> 
> // req1 completed, send req2
> ret = mbox_send_request(dc->mbox_chan, req2);
> ret = mbox_wait_request(ret, req2->wait);
> if (ret)
> 	return ret;
> 	
> This shall work regardless of whether the client expects a response or not. If
> no response is expected, but just a TX completion, then the client can
> set req->rx = NULL.
> 
>> reasons, it is left to the user to tie an incoming RX to some previous
>> TX, or not.

Is there a specific driver that I can look at in order to understand the
case where RX is not tied to TX? It will speed me up a little.
Also, if you think there's a better way to enable controllers to manage
their hardware queues, please say.

Thanks,
ta



* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-17 16:36 ` [PATCH v2 2/2] firmware: add exynos acpm driver Tudor Ambarus
@ 2024-10-21 11:52   ` Krzysztof Kozlowski
  2024-10-21 14:12     ` Tudor Ambarus
  2024-10-21 14:52   ` Tudor Ambarus
  2024-10-21 16:47   ` Alim Akhtar
  2 siblings, 1 reply; 25+ messages in thread
From: Krzysztof Kozlowski @ 2024-10-21 11:52 UTC (permalink / raw)
  To: Tudor Ambarus, jassisinghbrar
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

On 17/10/2024 18:36, Tudor Ambarus wrote:
> ACPM (Alive Clock and Power Manager) is firmware that runs on the APM
> (Active Power Management) module and handles the overall power management
> activities. ACPM and the masters regard each other as independent
> hardware components and communicate with each other using mailbox
> messages and shared memory.
> 
> The mailbox channels are initialized based on the configuration data
> found at a specific offset into the shared memory (mmio-sram). The
> configuration data consists of the channel ID, the message and queue
> lengths, pointers to the RX and TX queues (which are also part of the
> SRAM), and whether RX works by polling or by interrupts. All known
> clients of this driver use polling channels, so for now the driver
> implements only polling mode.
> 
> Add support for the exynos acpm core driver. Helper drivers will follow.
> These will construct the mailbox messages in the format expected by the
> firmware.

I skimmed through the driver and I do not understand why this is
firmware. You are implementing a mailbox provider/controller.

I did not perform a full review yet, just skimmed over the code. I will
take a look a bit later.

> 
> Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
> ---
>  drivers/firmware/Kconfig                    |   1 +
>  drivers/firmware/Makefile                   |   1 +
>  drivers/firmware/samsung/Kconfig            |  11 +
>  drivers/firmware/samsung/Makefile           |   3 +
>  drivers/firmware/samsung/exynos-acpm.c      | 703 ++++++++++++++++++++

Please add the directory to the Samsung Exynos SoC maintainer entry. I also
encourage adding a separate entry for the driver where you would be listed
as maintainer.

There is no firmware tree, so this will be going via Samsung SoC.

>  include/linux/mailbox/exynos-acpm-message.h |  21 +
>  6 files changed, 740 insertions(+)
>  create mode 100644 drivers/firmware/samsung/Kconfig
>  create mode 100644 drivers/firmware/samsung/Makefile
>  create mode 100644 drivers/firmware/samsung/exynos-acpm.c
>  create mode 100644 include/linux/mailbox/exynos-acpm-message.h
> 
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 71d8b26c4103..24edb956831b 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -267,6 +267,7 @@ source "drivers/firmware/meson/Kconfig"
>  source "drivers/firmware/microchip/Kconfig"
>  source "drivers/firmware/psci/Kconfig"
>  source "drivers/firmware/qcom/Kconfig"
> +source "drivers/firmware/samsung/Kconfig"
>  source "drivers/firmware/smccc/Kconfig"
>  source "drivers/firmware/tegra/Kconfig"
>  source "drivers/firmware/xilinx/Kconfig"
> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> index 7a8d486e718f..91efcc868a05 100644
> --- a/drivers/firmware/Makefile
> +++ b/drivers/firmware/Makefile
> @@ -33,6 +33,7 @@ obj-y				+= efi/
>  obj-y				+= imx/
>  obj-y				+= psci/
>  obj-y				+= qcom/
> +obj-y				+= samsung/
>  obj-y				+= smccc/
>  obj-y				+= tegra/
>  obj-y				+= xilinx/
> diff --git a/drivers/firmware/samsung/Kconfig b/drivers/firmware/samsung/Kconfig
> new file mode 100644
> index 000000000000..f908773c1441
> --- /dev/null
> +++ b/drivers/firmware/samsung/Kconfig
> @@ -0,0 +1,11 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config EXYNOS_ACPM
> +	tristate "Exynos ACPM (Alive Clock and Power Manager) driver support"

depends on ARCH_EXYNOS || COMPILE_TEST

Please also send an arm64 defconfig change making it a module.

> +	select MAILBOX
> +	help
> +	  ACPM is a firmware that operates on the APM (Active Power Management)
> +	  module that handles overall power management activities. ACPM and
> +	  masters regard each other as independent hardware component and
> +	  communicate with each other using mailbox messages and shared memory.
> +	  This module provides the means to communicate with the ACPM firmware.
> diff --git a/drivers/firmware/samsung/Makefile b/drivers/firmware/samsung/Makefile
> new file mode 100644
> index 000000000000..35ff3076bbea
> --- /dev/null
> +++ b/drivers/firmware/samsung/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +obj-$(CONFIG_EXYNOS_ACPM)	+= exynos-acpm.o
> diff --git a/drivers/firmware/samsung/exynos-acpm.c b/drivers/firmware/samsung/exynos-acpm.c
> new file mode 100644
> index 000000000000..c3ad4dc7a9e0
> --- /dev/null
> +++ b/drivers/firmware/samsung/exynos-acpm.c
> @@ -0,0 +1,703 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright 2020 Samsung Electronics Co., Ltd.
> + * Copyright 2020 Google LLC.
> + * Copyright 2024 Linaro Ltd.
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/bitmap.h>
> +#include <linux/bits.h>
> +#include <linux/container_of.h>
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/io.h>
> +#include <linux/iopoll.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/mailbox/exynos-acpm-message.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +
> +#define EXYNOS_ACPM_MCUCTRL		0x0	/* Mailbox Control Register */
> +#define EXYNOS_ACPM_INTCR0		0x24	/* Interrupt Clear Register 0 */
> +#define EXYNOS_ACPM_INTMR0		0x28	/* Interrupt Mask Register 0 */
> +#define EXYNOS_ACPM_INTSR0		0x2c	/* Interrupt Status Register 0 */
> +#define EXYNOS_ACPM_INTMSR0		0x30	/* Interrupt Mask Status Register 0 */
> +#define EXYNOS_ACPM_INTGR1		0x40	/* Interrupt Generation Register 1 */
> +#define EXYNOS_ACPM_INTMR1		0x48	/* Interrupt Mask Register 1 */
> +#define EXYNOS_ACPM_INTSR1		0x4c	/* Interrupt Status Register 1 */
> +#define EXYNOS_ACPM_INTMSR1		0x50	/* Interrupt Mask Status Register 1 */
> +
> +#define EXYNOS_ACPM_INTMR0_MASK		GENMASK(15, 0)
> +#define EXYNOS_ACPM_PROTOCOL_SEQNUM	GENMASK(21, 16)
> +
> +/* The unit of counter is 20 us. 5000 * 20 = 100 ms */
> +#define EXYNOS_ACPM_POLL_TIMEOUT	5000
> +#define EXYNOS_ACPM_TX_TIMEOUT_US	500000
> +
> +/**
> + * struct exynos_acpm_shmem - mailbox shared memory configuration information.
> + * @reserved:	reserved for future use.
> + * @chans:	offset to array of struct exynos_acpm_shmem_chan.
> + * @reserved1:	reserved for future use.
> + * @num_chans:	number of channels.
> + */
> +struct exynos_acpm_shmem {
> +	u32 reserved[2];
> +	u32 chans;
> +	u32 reserved1[3];
> +	u32 num_chans;
> +};
> +
> +/**
> + * struct exynos_acpm_shmem_chan - descriptor of a shared memory channel.
> + *
> + * @id:			channel ID.
> + * @reserved:		reserved for future use.
> + * @rx_rear:		rear pointer of RX queue.
> + * @rx_front:		front pointer of RX queue.
> + * @rx_base:		base address of RX queue.
> + * @reserved1:		reserved for future use.
> + * @tx_rear:		rear pointer of TX queue.
> + * @tx_front:		front pointer of TX queue.
> + * @tx_base:		base address of TX queue.
> + * @qlen:		queue length. Applies to both TX/RX queues.
> + * @mlen:		message length. Applies to both TX/RX queues.
> + * @reserved2:		reserved for future use.
> + * @polling:		true when the channel works on polling.
> + */
> +struct exynos_acpm_shmem_chan {
> +	u32 id;
> +	u32 reserved[3];
> +	u32 rx_rear;
> +	u32 rx_front;
> +	u32 rx_base;
> +	u32 reserved1[3];
> +	u32 tx_rear;
> +	u32 tx_front;
> +	u32 tx_base;
> +	u32 qlen;
> +	u32 mlen;
> +	u32 reserved2[2];
> +	u32 polling;

Why are you storing addresses as u32? Shouldn't these be __iomem*?

I also cannot find any piece of code setting several of the above, e.g. tx_base.

> +};
> +
> +/**
> + * struct exynos_acpm_queue - exynos acpm queue.
> + *
> + * @rear:	rear address of the queue.
> + * @front:	front address of the queue.
> + * @base:	base address of the queue.
> + */
> +struct exynos_acpm_queue {
> +	void __iomem *rear;
> +	void __iomem *front;
> +	void __iomem *base;
> +};
> +
> +/**
> + * struct exynos_acpm_rx_data - RX queue data.
> + *
> + * @cmd:	pointer to where the data shall be saved.
> + * @response:	true if the client expects the RX data.
> + */
> +struct exynos_acpm_rx_data {
> +	u32 *cmd;
> +	bool response;
> +};
> +
> +#define EXYNOS_ACPM_SEQNUM_MAX    64
> +

...


> +	if (IS_ERR(work_data))
> +		return PTR_ERR(work_data);
> +
> +	spin_lock_irqsave(&chan->tx_lock, flags);
> +
> +	tx_front = readl_relaxed(chan->tx.front);
> +	idx = (tx_front + 1) % chan->qlen;
> +
> +	ret = exynos_acpm_wait_for_queue_slots(mbox_chan, idx);
> +	if (ret)
> +		goto exit;
> +
> +	exynos_acpm_prepare_request(mbox_chan, req);
> +
> +	/* Write TX command. */
> +	__iowrite32_copy(chan->tx.base + chan->mlen * tx_front, tx->cmd,
> +			 req->txlen / 4);
> +
> +	/* Advance TX front. */
> +	writel_relaxed(idx, chan->tx.front);
> +
> +	/* Flush SRAM posted writes. */
> +	readl_relaxed(chan->tx.front);
> +

How does this flush work? Maybe you are just missing barriers here (s/relaxed//)?

> +	/* Generate ACPM interrupt. */
> +	writel_relaxed(BIT(chan->id), exynos_acpm->regs + EXYNOS_ACPM_INTGR1);
> +
> +	/* Flush mailbox controller posted writes. */
> +	readl_relaxed(exynos_acpm->regs + EXYNOS_ACPM_MCUCTRL);
> +
> +	spin_unlock_irqrestore(&chan->tx_lock, flags);
> +
> +	queue_work(exynos_acpm->wq, &work_data->work);
> +
> +	return -EINPROGRESS;
> +exit:
> +	spin_unlock_irqrestore(&chan->tx_lock, flags);
> +	kfree(work_data);
> +	return ret;
> +}
> +
> +static int exynos_acpm_chan_startup(struct mbox_chan *mbox_chan)
> +{
> +	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
> +
> +	if (!chan->polling) {
> +		dev_err(mbox_chan->mbox->dev, "IRQs not supported.\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct mbox_chan_ops exynos_acpm_chan_ops = {
> +	.send_request = exynos_acpm_send_request,
> +	.startup = exynos_acpm_chan_startup,
> +};
> +
> +static void __iomem *exynos_acpm_get_iomem_addr(void __iomem *base,
> +						void __iomem *addr)
> +{
> +	u32 offset;
> +
> +	offset = readl_relaxed(addr);
> +	return base + offset;
> +}
> +
> +static void exynos_acpm_rx_queue_init(struct exynos_acpm *exynos_acpm,
> +				      struct exynos_acpm_shmem_chan *shmem_chan,
> +				      struct exynos_acpm_queue *rx)
> +{
> +	void __iomem *base = exynos_acpm->sram_base;
> +
> +	rx->base = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_base);
> +	rx->front = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_front);
> +	rx->rear = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_rear);
> +}
> +
> +static void exynos_acpm_tx_queue_init(struct exynos_acpm *exynos_acpm,
> +				      struct exynos_acpm_shmem_chan *shmem_chan,
> +				      struct exynos_acpm_queue *tx)
> +{
> +	void __iomem *base = exynos_acpm->sram_base;
> +
> +	tx->base = exynos_acpm_get_iomem_addr(base, &shmem_chan->rx_base);
> +	tx->front = exynos_acpm_get_iomem_addr(base, &shmem_chan->rx_front);
> +	tx->rear = exynos_acpm_get_iomem_addr(base, &shmem_chan->rx_rear);
> +}
> +
> +static int exynos_acpm_alloc_cmds(struct exynos_acpm *exynos_acpm,
> +				  struct exynos_acpm_chan *chan)
> +{
> +	struct device *dev = exynos_acpm->dev;
> +	struct exynos_acpm_rx_data *rx_data;
> +	unsigned int mlen = chan->mlen;
> +	int i;
> +
> +	for (i = 0; i < EXYNOS_ACPM_SEQNUM_MAX; i++) {
> +		rx_data = &chan->rx_data[i];
> +		rx_data->cmd = devm_kcalloc(dev, mlen, sizeof(*(rx_data->cmd)),
> +					    GFP_KERNEL);
> +		if (!rx_data->cmd)
> +			return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +static int exynos_acpm_chans_init(struct exynos_acpm *exynos_acpm)
> +{
> +	struct exynos_acpm_shmem_chan *shmem_chans, *shmem_chan;
> +	struct exynos_acpm_shmem *shmem = exynos_acpm->shmem;
> +	struct mbox_chan *mbox_chan, *mbox_chans;
> +	struct exynos_acpm_chan *chan, *chans;
> +	struct device *dev = exynos_acpm->dev;
> +	int i, ret;
> +
> +	exynos_acpm->num_chans = readl_relaxed(&shmem->num_chans);
> +
> +	mbox_chans = devm_kcalloc(dev, exynos_acpm->num_chans,
> +				  sizeof(*mbox_chans), GFP_KERNEL);
> +	if (!mbox_chans)
> +		return -ENOMEM;
> +
> +	chans = devm_kcalloc(dev, exynos_acpm->num_chans, sizeof(*chans),
> +			     GFP_KERNEL);
> +	if (!chans)
> +		return -ENOMEM;
> +
> +	shmem_chans = exynos_acpm_get_iomem_addr(exynos_acpm->sram_base,
> +						 &shmem->chans);
> +
> +	for (i = 0; i < exynos_acpm->num_chans; i++) {
> +		shmem_chan = &shmem_chans[i];
> +		mbox_chan = &mbox_chans[i];
> +		chan = &chans[i];
> +
> +		/* Set parameters for the mailbox channel. */
> +		mbox_chan->con_priv = chan;
> +		mbox_chan->mbox = exynos_acpm->mbox;
> +
> +		/* Set parameters for the ACPM channel. */
> +		chan->mlen = readl_relaxed(&shmem_chan->mlen);
> +
> +		ret = exynos_acpm_alloc_cmds(exynos_acpm, chan);
> +		if (ret)
> +			return ret;
> +
> +		chan->polling = readl_relaxed(&shmem_chan->polling);
> +		chan->id = readl_relaxed(&shmem_chan->id);
> +		chan->qlen = readl_relaxed(&shmem_chan->qlen);
> +
> +		exynos_acpm_rx_queue_init(exynos_acpm, shmem_chan, &chan->rx);
> +		exynos_acpm_tx_queue_init(exynos_acpm, shmem_chan, &chan->tx);
> +
> +		mutex_init(&chan->rx_lock);
> +		spin_lock_init(&chan->tx_lock);
> +
> +		dev_vdbg(dev, "ID = %d poll = %d, mlen = %d, qlen = %d\n",
> +			 chan->id, chan->polling, chan->mlen, chan->qlen);
> +	}
> +
> +	/* Save pointers to the ACPM and mailbox channels. */
> +	exynos_acpm->mbox->chans = mbox_chans;
> +	exynos_acpm->chans = chans;
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id exynos_acpm_match[] = {
> +	{ .compatible = "google,gs101-acpm" },

Where are the bindings?

> +	{},
> +};
> +MODULE_DEVICE_TABLE(of, exynos_acpm_match);
> +
> +static int exynos_acpm_probe(struct platform_device *pdev)
> +{
> +	struct device_node *node = pdev->dev.of_node;
> +	struct device *dev = &pdev->dev;
> +	struct exynos_acpm *exynos_acpm;
> +	struct mbox_controller *mbox;
> +	struct device_node *shmem;
> +	resource_size_t size;
> +	struct resource res;
> +	const __be32 *prop;
> +	int ret;
> +
> +	exynos_acpm = devm_kzalloc(dev, sizeof(*exynos_acpm), GFP_KERNEL);
> +	if (!exynos_acpm)
> +		return -ENOMEM;
> +
> +	mbox = devm_kzalloc(dev, sizeof(*mbox), GFP_KERNEL);
> +	if (!mbox)
> +		return -ENOMEM;
> +
> +	exynos_acpm->regs = devm_platform_ioremap_resource(pdev, 0);
> +	if (IS_ERR(exynos_acpm->regs))
> +		return PTR_ERR(exynos_acpm->regs);
> +
> +	shmem = of_parse_phandle(node, "shmem", 0);
> +	ret = of_address_to_resource(shmem, 0, &res);
> +	of_node_put(shmem);
> +	if (ret) {
> +		dev_err(dev, "Failed to get shared memory.\n");
> +		return ret;
> +	}
> +
> +	size = resource_size(&res);
> +	exynos_acpm->sram_base = devm_ioremap(dev, res.start, size);
> +	if (!exynos_acpm->sram_base) {
> +		dev_err(dev, "Failed to ioremap shared memory.\n");
> +		return -ENOMEM;
> +	}
> +
> +	prop = of_get_property(node, "initdata-base", NULL);
> +	if (!prop) {
> +		dev_err(dev, "Parsing initdata_base failed.\n");
> +		return -EINVAL;
> +	}
> +
> +	exynos_acpm->pclk = devm_clk_get(dev, "pclk");

devm_clk_get_enabled

> +	if (IS_ERR(exynos_acpm->pclk)) {
> +		dev_err(dev, "Missing peripheral clock.\n");

return dev_err_probe()
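Untested, but I mean roughly:

	exynos_acpm->pclk = devm_clk_get_enabled(dev, "pclk");
	if (IS_ERR(exynos_acpm->pclk))
		return dev_err_probe(dev, PTR_ERR(exynos_acpm->pclk),
				     "Missing peripheral clock\n");

which also lets you drop the explicit clk_prepare_enable()/clk_disable_unprepare().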

> +		return PTR_ERR(exynos_acpm->pclk);
> +	}
> +
> +	ret = clk_prepare_enable(exynos_acpm->pclk);
> +	if (ret) {
> +		dev_err(dev, "Failed to enable the peripheral clock.\n");
> +		return ret;
> +	}
> +
> +	exynos_acpm->wq = alloc_workqueue("exynos-acpm-wq", 0, 0);
> +	if (!exynos_acpm->wq)
> +		return -ENOMEM;
> +
> +	exynos_acpm->dev = dev;
> +	exynos_acpm->mbox = mbox;
> +	exynos_acpm->shmem = exynos_acpm->sram_base + be32_to_cpup(prop);
> +
> +	ret = exynos_acpm_chans_init(exynos_acpm);
> +	if (ret)
> +		return ret;
> +
> +	mbox->num_chans = exynos_acpm->num_chans;
> +	mbox->dev = dev;
> +	mbox->ops = &exynos_acpm_chan_ops;
> +
> +	platform_set_drvdata(pdev, exynos_acpm);
> +
> +	/* Mask out all interrupts. We support just polling channels for now. */
> +	writel_relaxed(EXYNOS_ACPM_INTMR0_MASK,
> +		       exynos_acpm->regs + EXYNOS_ACPM_INTMR0);
> +
> +	ret = devm_mbox_controller_register(dev, mbox);
> +	if (ret)
> +		dev_err(dev, "Failed to register mbox_controller(%d).\n", ret);
> +
> +	return ret;
> +}
> +
> +static void exynos_acpm_remove(struct platform_device *pdev)
> +{
> +	struct exynos_acpm *exynos_acpm = platform_get_drvdata(pdev);
> +
> +	flush_workqueue(exynos_acpm->wq);
> +	destroy_workqueue(exynos_acpm->wq);
> +	clk_disable_unprepare(exynos_acpm->pclk);
> +}
> +
> +static struct platform_driver exynos_acpm_driver = {
> +	.probe	= exynos_acpm_probe,
> +	.remove = exynos_acpm_remove,
> +	.driver	= {
> +		.name = "exynos-acpm",
> +		.owner	= THIS_MODULE,

Drop

> +		.of_match_table	= exynos_acpm_match,
> +	},
> +};
> +module_platform_driver(exynos_acpm_driver);

Best regards,
Krzysztof



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-21 11:52   ` Krzysztof Kozlowski
@ 2024-10-21 14:12     ` Tudor Ambarus
  2024-10-22  4:38       ` Krzysztof Kozlowski
  0 siblings, 1 reply; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-21 14:12 UTC (permalink / raw)
  To: Krzysztof Kozlowski, jassisinghbrar
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

Hi, Krzysztof,

On 10/21/24 12:52 PM, Krzysztof Kozlowski wrote:
> On 17/10/2024 18:36, Tudor Ambarus wrote:
>> ACPM (Alive Clock and Power Manager) is a firmware that operates on the
>> APM (Active Power Management) module that handles overall power management
>> activities. ACPM and masters regard each other as independent
>> hardware component and communicate with each other using mailbox messages
>> and shared memory.
>>
>> The mailbox channels are initialized based on the configuration data
>> found at a specific offset into the shared memory (mmio-sram). The
>> configuration data consists of channel id, message and queue lengths,
>> pointers to the RX and TX queues which are also part of the SRAM, and
>> whether RX works by polling or interrupts. All known clients of this
>> driver are using polling channels, thus the driver implements for now
>> just polling mode.
>>
>> Add support for the exynos acpm core driver. Helper drivers will follow.
>> These will construct the mailbox messages in the format expected by the
>> firmware.
> 
> I skimmed through the driver and I do not understand why this is
> firmware. You are implementing a mailbox provider/controller.

In my case the mailbox hardware is used just to raise the interrupt to
the other side. Then there's the SRAM, which contains the channel
configuration data and the TX/RX queues. The enqueue/dequeue is done
in/from SRAM. This closely resembles drivers/firmware/arm_scmi/, see:

drivers/firmware/arm_scmi/shmem.c
drivers/firmware/arm_scmi/transports/mailbox.c

After the SRAM and mailbox/transport code I'll come up with two helper
drivers that construct the mailbox messages in the format expected by
the firmware. There are 2 types of messages recognized by the ACPM
firmware: PMIC and DVFS. The client drivers will use these helper
drivers to prepare a specific message. Then they will use the mailbox
core to send the message and they'll wait for the answer.
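To give a rough idea of the intended usage (the helper function name below is
made up, purely to illustrate the layering), a client would end up doing
something like:

	struct exynos_acpm_message tx, rx;
	struct mbox_request req = { .tx = &tx, .rx = &rx, };

	/* helper encodes the command in the format the ACPM firmware expects */
	acpm_pmic_encode_read(&tx, ...);

	/* the request is then handed to the mailbox core and the client
	 * waits for the response to land in rx.
	 */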

This layered structure and the use of SRAM resemble arm_scmi and made
me think that the ACPM driver is better suited for drivers/firmware.
I'm open to suggestions though.

> 
> I did not perform full review yet, just skimmed over the code. I will
> take a look a bit later.
> 

No worries, thanks for doing it. I agree with all the suggestions below
and I'll address them after we reach an agreement with Jassi on how to
extend the mailbox core.

More answers below.

>>
>> Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
>> ---
>>  drivers/firmware/Kconfig                    |   1 +
>>  drivers/firmware/Makefile                   |   1 +
>>  drivers/firmware/samsung/Kconfig            |  11 +
>>  drivers/firmware/samsung/Makefile           |   3 +
>>  drivers/firmware/samsung/exynos-acpm.c      | 703 ++++++++++++++++++++
> 
> Please add directory to the Samsung Exynos SoC maintainer entry. I also
> encourage adding separate entry for the driver where you would be listed
> as maintainer.

ok

> 
> There is no firmware tree, so this will be going via Samsung SoC.

I noticed afterwards, thanks.

> 
>>  include/linux/mailbox/exynos-acpm-message.h |  21 +
>>  6 files changed, 740 insertions(+)
>>  create mode 100644 drivers/firmware/samsung/Kconfig
>>  create mode 100644 drivers/firmware/samsung/Makefile
>>  create mode 100644 drivers/firmware/samsung/exynos-acpm.c
>>  create mode 100644 include/linux/mailbox/exynos-acpm-message.h
>>
>> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
>> index 71d8b26c4103..24edb956831b 100644
>> --- a/drivers/firmware/Kconfig
>> +++ b/drivers/firmware/Kconfig
>> @@ -267,6 +267,7 @@ source "drivers/firmware/meson/Kconfig"
>>  source "drivers/firmware/microchip/Kconfig"
>>  source "drivers/firmware/psci/Kconfig"
>>  source "drivers/firmware/qcom/Kconfig"
>> +source "drivers/firmware/samsung/Kconfig"
>>  source "drivers/firmware/smccc/Kconfig"
>>  source "drivers/firmware/tegra/Kconfig"
>>  source "drivers/firmware/xilinx/Kconfig"
>> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
>> index 7a8d486e718f..91efcc868a05 100644
>> --- a/drivers/firmware/Makefile
>> +++ b/drivers/firmware/Makefile
>> @@ -33,6 +33,7 @@ obj-y				+= efi/
>>  obj-y				+= imx/
>>  obj-y				+= psci/
>>  obj-y				+= qcom/
>> +obj-y				+= samsung/
>>  obj-y				+= smccc/
>>  obj-y				+= tegra/
>>  obj-y				+= xilinx/
>> diff --git a/drivers/firmware/samsung/Kconfig b/drivers/firmware/samsung/Kconfig
>> new file mode 100644
>> index 000000000000..f908773c1441
>> --- /dev/null
>> +++ b/drivers/firmware/samsung/Kconfig
>> @@ -0,0 +1,11 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +
>> +config EXYNOS_ACPM
>> +	tristate "Exynos ACPM (Alive Clock and Power Manager) driver support"
> 
> depends on ARCH_EXYNOS || COMPILE_TEST

oh yes.
> 
> Please also send an arm64 defconfig change making it a module.

will do

cut

>> +
>> +/**
>> + * struct exynos_acpm_shmem_chan - descriptor of a shared memory channel.
>> + *
>> + * @id:			channel ID.
>> + * @reserved:		reserved for future use.
>> + * @rx_rear:		rear pointer of RX queue.
>> + * @rx_front:		front pointer of RX queue.
>> + * @rx_base:		base address of RX queue.
>> + * @reserved1:		reserved for future use.
>> + * @tx_rear:		rear pointer of TX queue.
>> + * @tx_front:		front pointer of TX queue.
>> + * @tx_base:		base address of TX queue.
>> + * @qlen:		queue length. Applies to both TX/RX queues.
>> + * @mlen:		message length. Applies to both TX/RX queues.
>> + * @reserved2:		reserved for future use.
>> + * @polling:		true when the channel works on polling.
>> + */
>> +struct exynos_acpm_shmem_chan {
>> +	u32 id;
>> +	u32 reserved[3];
>> +	u32 rx_rear;
>> +	u32 rx_front;
>> +	u32 rx_base;
>> +	u32 reserved1[3];
>> +	u32 tx_rear;
>> +	u32 tx_front;
>> +	u32 tx_base;
>> +	u32 qlen;
>> +	u32 mlen;
>> +	u32 reserved2[2];
>> +	u32 polling;
> 
> Why are you storing addresses as u32? Shouldn't these be __iomem*?

This structure defines the offsets in SRAM that describe the channel
parameters. Instances of this struct should indeed be declared as:
	struct exynos_acpm_shmem_chan __iomem *shmem_chan;
I missed that in v2, but will update in v3.

> 
> I also cannot find any piece of code setting several of the above, e.g. tx_base.

I'm not writing any SRAM configuration fields; these fields are used to
read/retrieve the channel parameters from SRAM.
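I.e. the driver only ever reads them, e.g.:

	chan->qlen = readl_relaxed(&shmem_chan->qlen);

with shmem_chan becoming a struct exynos_acpm_shmem_chan __iomem * as you
pointed out.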

cut

>> +
>> +	spin_lock_irqsave(&chan->tx_lock, flags);
>> +
>> +	tx_front = readl_relaxed(chan->tx.front);
>> +	idx = (tx_front + 1) % chan->qlen;
>> +
>> +	ret = exynos_acpm_wait_for_queue_slots(mbox_chan, idx);
>> +	if (ret)
>> +		goto exit;
>> +
>> +	exynos_acpm_prepare_request(mbox_chan, req);
>> +
>> +	/* Write TX command. */
>> +	__iowrite32_copy(chan->tx.base + chan->mlen * tx_front, tx->cmd,
>> +			 req->txlen / 4);
>> +
>> +	/* Advance TX front. */
>> +	writel_relaxed(idx, chan->tx.front);
>> +
>> +	/* Flush SRAM posted writes. */
>> +	readl_relaxed(chan->tx.front);
>> +
> 
> How does this flush work? Maybe you are just missing barriers here (s/relaxed//)?

I think _relaxed() accessors should be fine in this driver as there are
no DMA accesses involved. _relaxed() accessors are fully ordered for
accesses to the same device/endpoint.

I'm worried however about the posted writes: the buses the devices sit
on may themselves have asynchronicity. So I issue a read from the same
device to ensure that the write has occurred.

There is something that I haven't demystified though. You'll notice that
the writes above are to SRAM. I enqueue on the TX queue, then advance
the head/front of the queue, and then I read back to make sure that the
writes occurred. Below I write to the controller's interrupt register
(a different device/endpoint) to raise the interrupt for the counterpart
and inform it that the TX queue advanced. I'm not sure whether I need a
barrier here to make sure that the CPU does not reorder the accesses and
raise the interrupt before advancing the TX queue. If someone already
knows the answer, please say so; I'll do some more reading in the
meantime.
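If the read-back turns out not to be enough, I guess the conservative option
is to use the non-relaxed accessor for the doorbell write, something like:

	/* Advance TX front. */
	writel_relaxed(idx, chan->tx.front);

	/*
	 * writel() implies a write barrier, ordering the SRAM updates above
	 * before the doorbell write.
	 */
	writel(BIT(chan->id), exynos_acpm->regs + EXYNOS_ACPM_INTGR1);

(untested, just to show what I have in mind.)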

> 
>> +	/* Generate ACPM interrupt. */
>> +	writel_relaxed(BIT(chan->id), exynos_acpm->regs + EXYNOS_ACPM_INTGR1);
>> +
>> +	/* Flush mailbox controller posted writes. */
>> +	readl_relaxed(exynos_acpm->regs + EXYNOS_ACPM_MCUCTRL);
>> +
>> +	spin_unlock_irqrestore(&chan->tx_lock, flags);
>> +
>> +	queue_work(exynos_acpm->wq, &work_data->work);
>> +
>> +	return -EINPROGRESS;
>> +exit:
>> +	spin_unlock_irqrestore(&chan->tx_lock, flags);
>> +	kfree(work_data);
>> +	return ret;
>> +}

cut

>> +static const struct of_device_id exynos_acpm_match[] = {
>> +	{ .compatible = "google,gs101-acpm" },
> 
> Where are the bindings?

will follow soon.

cut

>> +static int exynos_acpm_probe(struct platform_device *pdev)
>> +{

cut

>> +	exynos_acpm->pclk = devm_clk_get(dev, "pclk");
> 
> devm_clk_get_enabled

ok

> 
>> +	if (IS_ERR(exynos_acpm->pclk)) {
>> +		dev_err(dev, "Missing peripheral clock.\n");
> 
> return dev_err_probe()

ok

cut
>> +		.owner	= THIS_MODULE,
> 
> Drop

oh yes, thanks!

ta


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-17 16:36 ` [PATCH v2 2/2] firmware: add exynos acpm driver Tudor Ambarus
  2024-10-21 11:52   ` Krzysztof Kozlowski
@ 2024-10-21 14:52   ` Tudor Ambarus
  2024-10-21 16:47   ` Alim Akhtar
  2 siblings, 0 replies; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-21 14:52 UTC (permalink / raw)
  To: krzk
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano, Jassi Brar



On 10/17/24 5:36 PM, Tudor Ambarus wrote:
> +static int exynos_acpm_chans_init(struct exynos_acpm *exynos_acpm)
> +{
> +	struct exynos_acpm_shmem_chan *shmem_chans, *shmem_chan;
> +	struct exynos_acpm_shmem *shmem = exynos_acpm->shmem;

As Krzysztof has already noticed, I need to use the __iomem pointer
annotation when referring to shmem structures. This could have been
caught with sparse as well, lesson learnt:

   drivers/firmware/samsung/exynos-acpm.c:493:54: sparse: sparse:
incorrect type in argument 2 (different address spaces) @@     expected
void [noderef] __iomem *addr @@     got unsigned int * @@
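(For the record, that warning shows up with something like
"make C=1 drivers/firmware/samsung/".)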


The following changes will be needed for v3:

--- a/drivers/firmware/samsung/exynos-acpm.c
+++ b/drivers/firmware/samsung/exynos-acpm.c
@@ -161,7 +161,7 @@ struct exynos_acpm_chan {
  * @num_chans: number of channels available for this controller.
  */
 struct exynos_acpm {
-       struct exynos_acpm_shmem *shmem;
+       struct exynos_acpm_shmem __iomem *shmem;
        struct exynos_acpm_chan *chans;
        void __iomem *sram_base;
        void __iomem *regs;
@@ -485,8 +485,8 @@ static void __iomem *exynos_acpm_get_iomem_addr(void __iomem *base,
 }

 static void exynos_acpm_rx_queue_init(struct exynos_acpm *exynos_acpm,
-                                     struct exynos_acpm_shmem_chan *shmem_chan,
-                                     struct exynos_acpm_queue *rx)
+               struct exynos_acpm_shmem_chan __iomem *shmem_chan,
+               struct exynos_acpm_queue *rx)
 {
        void __iomem *base = exynos_acpm->sram_base;

@@ -496,8 +496,8 @@ static void exynos_acpm_rx_queue_init(struct exynos_acpm *exynos_acpm,
 }

 static void exynos_acpm_tx_queue_init(struct exynos_acpm *exynos_acpm,
-                                     struct exynos_acpm_shmem_chan *shmem_chan,
-                                     struct exynos_acpm_queue *tx)
+               struct exynos_acpm_shmem_chan __iomem *shmem_chan,
+               struct exynos_acpm_queue *tx)
 {
        void __iomem *base = exynos_acpm->sram_base;

@@ -527,8 +527,8 @@ static int exynos_acpm_alloc_cmds(struct exynos_acpm *exynos_acpm,

 static int exynos_acpm_chans_init(struct exynos_acpm *exynos_acpm)
 {
-       struct exynos_acpm_shmem_chan *shmem_chans, *shmem_chan;
-       struct exynos_acpm_shmem *shmem = exynos_acpm->shmem;
+       struct exynos_acpm_shmem_chan __iomem *shmem_chans, *shmem_chan;
+       struct exynos_acpm_shmem __iomem *shmem = exynos_acpm->shmem;
        struct mbox_chan *mbox_chan, *mbox_chans;
        struct exynos_acpm_chan *chan, *chans;
        struct device *dev = exynos_acpm->dev;


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-21  6:18       ` Tudor Ambarus
@ 2024-10-21 16:32         ` Jassi Brar
  2024-10-22 13:26           ` Tudor Ambarus
  0 siblings, 1 reply; 25+ messages in thread
From: Jassi Brar @ 2024-10-21 16:32 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: krzk, alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

> On 10/18/24 8:49 AM, Tudor Ambarus wrote:

> The active request is considered completed when TX completes. But it seems
> that TX is not in direct relation with RX,

TX and RX are assumed independent operations (which they are).
TX means sending a message/request to the remote successfully. 'Success'
can be indicated by any of the three methods: TXDONE_BY_{POLLING, IRQ,
ACK}.
You seem to assume it should always be an ACK where we receive an
acknowledgment/response packet, which is not the case.

...

>> Correct, and it is not meant to be.
>> You are assuming there is always an RX in response to a TX, which is

> Not really. If there's no response expected, clients can set req->rx to NULL.

You are assuming the default behaviour is that there is a reply
associated with a TX, otherwise "clients can set req->rx to NULL".
This API can be _used_ only for protocols that expect a response for
each TX. For other protocols, it is simply a passthrough wrapper (by
doing things like req->rx = NULL). For handling this special case of
Tx->Rx, maybe a higher-level common helper function would be better.

...

>> reasons, it is left to the user to tie an incoming RX to some previous
>> TX, or not.

> Is there a specific driver that I can look at in order to understand the
> case where RX is not tied to TX?

 Many...
 * The remote end owns sensors and can asynchronously send, say
thermal, notifications to Linux.
 * On some platforms the RX may be asynchronous, i.e. sent later
with some identifying tag of the past TX.
 * Just reverse the roles of local and remote - remote sends us a
request (RX for us) and we are supposed to send a response (TX).
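In all such cases the client just gets the data via its rx_callback and the
protocol decides what it means, e.g.:

	static void cl_rx_callback(struct mbox_client *cl, void *mssg)
	{
		/*
		 * Could be a notification, a delayed response carrying a tag
		 * of some past TX, or a fresh request from the remote.
		 */
	}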

> Also, if you think there's a better way to enable controllers to manage
> their hardware queues, please say.
>
Tying RX to TX has nothing to do with hardware queues. There can be a
hardware queue and the protocol can still be
"local-to-remote-broadcast".

While I don't think we need the "Rx is in relation to some past Tx"
API, I am open to ideas to better utilize h/w queues.

The h/w TX queue/fifo may hold, say, 8 messages while the controller
transmits them to the remote. Currently that is implemented by
.last_tx_done() returning true as long as the fifo is not full.
The other option, having more than one TX in-flight, only complicates
the mailbox API without bringing any real benefit, because the protocol
would still need to handle cases like the Tx was successful but the
remote rejected the request, or the Rx failed on the h/w fifo (there
could be an rx-fifo too, right?). That is where I am at right now.
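I.e. something like this (controller-specific names made up):

	static bool foo_last_tx_done(struct mbox_chan *chan)
	{
		struct foo_mbox *mb = chan->con_priv;

		/* 'done' as long as there is room for another message */
		return !foo_hw_fifo_full(mb);
	}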

Regards,
Jassi


^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-17 16:36 ` [PATCH v2 2/2] firmware: add exynos acpm driver Tudor Ambarus
  2024-10-21 11:52   ` Krzysztof Kozlowski
  2024-10-21 14:52   ` Tudor Ambarus
@ 2024-10-21 16:47   ` Alim Akhtar
  2024-10-25  9:44     ` Tudor Ambarus
  2 siblings, 1 reply; 25+ messages in thread
From: Alim Akhtar @ 2024-10-21 16:47 UTC (permalink / raw)
  To: 'Tudor Ambarus', jassisinghbrar, krzk
  Cc: mst, javierm, tzimmermann, bartosz.golaszewski, luzmaximilian,
	sudeep.holla, conor.dooley, bjorn, ulf.hansson, linux-samsung-soc,
	linux-kernel, linux-arm-kernel, marcan, neal, alyssa, broonie,
	andre.draszik, willmcvicker, peter.griffin, kernel-team,
	vincent.guittot, daniel.lezcano

Hi Tudor

> -----Original Message-----
> From: Tudor Ambarus <tudor.ambarus@linaro.org>
> Sent: Thursday, October 17, 2024 10:07 PM
> To: jassisinghbrar@gmail.com; krzk@kernel.org
> Cc: alim.akhtar@samsung.com; mst@redhat.com; javierm@redhat.com;
> tzimmermann@suse.de; bartosz.golaszewski@linaro.org;
> luzmaximilian@gmail.com; sudeep.holla@arm.com;
> conor.dooley@microchip.com; bjorn@rivosinc.com; ulf.hansson@linaro.org;
> linux-samsung-soc@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> arm-kernel@lists.infradead.org; marcan@marcan.st; neal@gompa.dev;
> alyssa@rosenzweig.io; broonie@kernel.org; andre.draszik@linaro.org;
> willmcvicker@google.com; peter.griffin@linaro.org; kernel-
> team@android.com; vincent.guittot@linaro.org; daniel.lezcano@linaro.org;
> Tudor Ambarus <tudor.ambarus@linaro.org>
> Subject: [PATCH v2 2/2] firmware: add exynos acpm driver
> 
> ACPM (Alive Clock and Power Manager) is a firmware that operates on the
> APM (Active Power Management) module that handles overall power
> management activities. ACPM and masters regard each other as
> independent hardware component and communicate with each other using
> mailbox messages and shared memory.
> 
> The mailbox channels are initialized based on the configuration data found at
> a specific offset into the shared memory (mmio-sram). The configuration
> data consists of channel id, message and queue lengths, pointers to the RX
> and TX queues which are also part of the SRAM, and whether RX works by
> polling or interrupts. All known clients of this driver are using polling
> channels, thus the driver implements for now just polling mode.
> 
> Add support for the exynos acpm core driver. Helper drivers will follow.
> These will construct the mailbox messages in the format expected by the
> firmware.
> 
> Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
> ---
>  drivers/firmware/Kconfig                    |   1 +
>  drivers/firmware/Makefile                   |   1 +
>  drivers/firmware/samsung/Kconfig            |  11 +
>  drivers/firmware/samsung/Makefile           |   3 +
>  drivers/firmware/samsung/exynos-acpm.c      | 703 ++++++++++++++++++++
>  include/linux/mailbox/exynos-acpm-message.h |  21 +
>  6 files changed, 740 insertions(+)
>  create mode 100644 drivers/firmware/samsung/Kconfig
>  create mode 100644 drivers/firmware/samsung/Makefile
>  create mode 100644 drivers/firmware/samsung/exynos-acpm.c
>  create mode 100644 include/linux/mailbox/exynos-acpm-message.h
> 
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 71d8b26c4103..24edb956831b 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -267,6 +267,7 @@ source "drivers/firmware/meson/Kconfig"
>  source "drivers/firmware/microchip/Kconfig"
>  source "drivers/firmware/psci/Kconfig"
>  source "drivers/firmware/qcom/Kconfig"
> +source "drivers/firmware/samsung/Kconfig"
>  source "drivers/firmware/smccc/Kconfig"
>  source "drivers/firmware/tegra/Kconfig"
>  source "drivers/firmware/xilinx/Kconfig"
> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> index 7a8d486e718f..91efcc868a05 100644
> --- a/drivers/firmware/Makefile
> +++ b/drivers/firmware/Makefile
> @@ -33,6 +33,7 @@ obj-y				+= efi/
>  obj-y				+= imx/
>  obj-y				+= psci/
>  obj-y				+= qcom/
> +obj-y				+= samsung/
>  obj-y				+= smccc/
>  obj-y				+= tegra/
>  obj-y				+= xilinx/
> diff --git a/drivers/firmware/samsung/Kconfig b/drivers/firmware/samsung/Kconfig
> new file mode 100644
> index 000000000000..f908773c1441
> --- /dev/null
> +++ b/drivers/firmware/samsung/Kconfig
> @@ -0,0 +1,11 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config EXYNOS_ACPM

This looks misleading to me. As you mentioned below, ACPM is a firmware
which runs on the APM module, and the proposed driver is a communication
method between the application processor and the APM module, which is via
mailbox. So EXYNOS_MAILBOX_APM would be more meaningful here.

> +	tristate "Exynos ACPM (Alive Clock and Power Manager) driver support"
> +	select MAILBOX
> +	help
> +	  ACPM is a firmware that operates on the APM (Active Power Management)
> +	  module that handles overall power management activities. ACPM and
> +	  masters regard each other as independent hardware component and
> +	  communicate with each other using mailbox messages and shared memory.
> +	  This module provides the means to communicate with the ACPM firmware.
> diff --git a/drivers/firmware/samsung/Makefile b/drivers/firmware/samsung/Makefile
> new file mode 100644
> index 000000000000..35ff3076bbea
> --- /dev/null
> +++ b/drivers/firmware/samsung/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +obj-$(CONFIG_EXYNOS_ACPM)	+= exynos-acpm.o
> diff --git a/drivers/firmware/samsung/exynos-acpm.c b/drivers/firmware/samsung/exynos-acpm.c
> new file mode 100644
> index 000000000000..c3ad4dc7a9e0
> --- /dev/null
> +++ b/drivers/firmware/samsung/exynos-acpm.c
> @@ -0,0 +1,703 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright 2020 Samsung Electronics Co., Ltd.
> + * Copyright 2020 Google LLC.
> + * Copyright 2024 Linaro Ltd.
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/bitmap.h>
> +#include <linux/bits.h>
> +#include <linux/container_of.h>
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/io.h>
> +#include <linux/iopoll.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/mailbox/exynos-acpm-message.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +
> +#define EXYNOS_ACPM_MCUCTRL		0x0	/* Mailbox Control Register */
> +#define EXYNOS_ACPM_INTCR0		0x24	/* Interrupt Clear Register 0 */
> +#define EXYNOS_ACPM_INTMR0		0x28	/* Interrupt Mask Register 0 */
> +#define EXYNOS_ACPM_INTSR0		0x2c	/* Interrupt Status Register 0 */
> +#define EXYNOS_ACPM_INTMSR0		0x30	/* Interrupt Mask Status Register 0 */
> +#define EXYNOS_ACPM_INTGR1		0x40	/* Interrupt Generation Register 1 */
> +#define EXYNOS_ACPM_INTMR1		0x48	/* Interrupt Mask Register 1 */
> +#define EXYNOS_ACPM_INTSR1		0x4c	/* Interrupt Status Register 1 */
> +#define EXYNOS_ACPM_INTMSR1		0x50	/* Interrupt Mask Status Register 1 */
> +
> +#define EXYNOS_ACPM_INTMR0_MASK		GENMASK(15, 0)
> +#define EXYNOS_ACPM_PROTOCOL_SEQNUM	GENMASK(21, 16)
> +
> +/* The unit of counter is 20 us. 5000 * 20 = 100 ms */
> +#define EXYNOS_ACPM_POLL_TIMEOUT	5000
> +#define EXYNOS_ACPM_TX_TIMEOUT_US	500000
> +
> +/**
> + * struct exynos_acpm_shmem - mailbox shared memory configuration information.
> + * @reserved:	reserved for future use.
> + * @chans:	offset to array of struct exynos_acpm_shmem_chan.
> + * @reserved1:	reserved for future use.
> + * @num_chans:	number of channels.
> + */
> +struct exynos_acpm_shmem {
> +	u32 reserved[2];
> +	u32 chans;
> +	u32 reserved1[3];
> +	u32 num_chans;
> +};
> +
> +/**
> + * struct exynos_acpm_shmem_chan - descriptor of a shared memory channel.
> + *
> + * @id:			channel ID.
> + * @reserved:		reserved for future use.
> + * @rx_rear:		rear pointer of RX queue.
> + * @rx_front:		front pointer of RX queue.
> + * @rx_base:		base address of RX queue.
> + * @reserved1:		reserved for future use.
> + * @tx_rear:		rear pointer of TX queue.
> + * @tx_front:		front pointer of TX queue.
> + * @tx_base:		base address of TX queue.
> + * @qlen:		queue length. Applies to both TX/RX queues.
> + * @mlen:		message length. Applies to both TX/RX queues.
> + * @reserved2:		reserved for future use.
> + * @polling:		true when the channel works on polling.
> + */
> +struct exynos_acpm_shmem_chan {
> +	u32 id;
> +	u32 reserved[3];
> +	u32 rx_rear;
> +	u32 rx_front;
> +	u32 rx_base;
> +	u32 reserved1[3];
> +	u32 tx_rear;
> +	u32 tx_front;
> +	u32 tx_base;
> +	u32 qlen;
> +	u32 mlen;
> +	u32 reserved2[2];
> +	u32 polling;
> +};
> +
> +/**
> + * struct exynos_acpm_queue - exynos acpm queue.
> + *
> + * @rear:	rear address of the queue.
> + * @front:	front address of the queue.
> + * @base:	base address of the queue.
> + */
> +struct exynos_acpm_queue {
> +	void __iomem *rear;
> +	void __iomem *front;
> +	void __iomem *base;
> +};
> +
> +/**
> + * struct exynos_acpm_rx_data - RX queue data.
> + *
> + * @cmd:	pointer to where the data shall be saved.
> + * @response:	true if the client expects the RX data.
> + */
> +struct exynos_acpm_rx_data {
> +	u32 *cmd;
> +	bool response;
> +};
> +
> +#define EXYNOS_ACPM_SEQNUM_MAX    64
> +
> +/**
> + * struct exynos_acpm_chan - driver internal representation of a channel.
> + * @tx:		TX queue. The enqueue is done by the host.
> + *			- front index is written by the host.
> + *			- rear index is written by the firmware.
> + *
> + * @rx:		RX queue. The enqueue is done by the firmware.
> + *			- front index is written by the firmware.
> + *			- rear index is written by the host.
> + * @rx_lock:	protects RX queue. The RX queue is accessed just in
> + *		process context.
> + * @tx_lock:	protects TX queue.
> + * @qlen:	queue length. Applies to both TX/RX queues.
> + * @mlen:	message length. Applies to both TX/RX queues.
> + * @seqnum:	sequence number of the last message enqueued on TX queue.
> + * @id:		channel ID.
> + * @polling:	true when the channel works on polling.
> + * @bitmap_seqnum: bitmap that tracks the messages on the TX/RX queues.
> + * @rx_data:	internal buffer used to drain the RX queue.
> + */
> +struct exynos_acpm_chan {
> +	struct exynos_acpm_queue tx;
> +	struct exynos_acpm_queue rx;
> +	struct mutex rx_lock;
> +	spinlock_t tx_lock;
> +
> +	unsigned int qlen;
> +	unsigned int mlen;
> +	u8 seqnum;
> +	u8 id;
> +	bool polling;
> +
> +	DECLARE_BITMAP(bitmap_seqnum, EXYNOS_ACPM_SEQNUM_MAX - 1);
> +	struct exynos_acpm_rx_data rx_data[EXYNOS_ACPM_SEQNUM_MAX];
> +};
> +
> +/**
> + * struct exynos_acpm - driver's private data.
> + * @shmem:	pointer to the SRAM configuration data.
> + * @chans:	pointer to the ACPM channel parameters retrieved from SRAM.
> + * @sram_base:	base address of SRAM.
> + * @regs:	mailbox registers base address.
> + * @mbox:	pointer to the mailbox controller.
> + * @wq:		pointer to workqueue.
> + * @dev:	pointer to the exynos-acpm device.
> + * @pclk:	pointer to the mailbox peripheral clock.
> + * @num_chans:	number of channels available for this controller.
> + */
> +struct exynos_acpm {
> +	struct exynos_acpm_shmem *shmem;
> +	struct exynos_acpm_chan *chans;
> +	void __iomem *sram_base;
> +	void __iomem *regs;
> +	struct mbox_controller *mbox;
> +	struct workqueue_struct *wq;
> +	struct device *dev;
> +	struct clk *pclk;
> +	u32 num_chans;
> +};
> +
> +/**
> + * struct exynos_acpm_work_data - data structure representing the work.
> + * @mbox_chan:	pointer to the mailbox channel.
> + * @req:	pointer to the mailbox request.
> + * @callback:	pointer to a callback function to be invoked upon
> + *		completion of this request.
> + * @work:	describes the task to be executed.
> + */
> +struct exynos_acpm_work_data {
> +	struct mbox_chan *mbox_chan;
> +	struct mbox_request *req;
> +	void (*callback)(struct exynos_acpm_work_data *work_data, int status);
> +	struct work_struct work;
> +};
> +
> +static int exynos_acpm_get_rx(struct mbox_chan *mbox_chan,
> +			      struct mbox_request *req)
> +{
> +	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
> +	struct exynos_acpm_message *tx = req->tx;
> +	struct exynos_acpm_message *rx = req->rx;
> +	struct exynos_acpm_rx_data *rx_data;
> +	const void __iomem *base, *addr;
> +	u32 rx_front, rx_seqnum, tx_seqnum, seqnum;
> +	u32 i, val, mlen;
> +	bool rx_set = false;
> +
> +	rx_front = readl_relaxed(chan->rx.front);
> +	i = readl_relaxed(chan->rx.rear);
> +
> +	/* Bail out if RX is empty. */
> +	if (i == rx_front)
> +		return 0;
> +
> +	base = chan->rx.base;
> +	mlen = chan->mlen;
> +
> +	tx_seqnum = FIELD_GET(EXYNOS_ACPM_PROTOCOL_SEQNUM, tx->cmd[0]);
> +
> +	/* Drain RX queue. */
> +	do {
> +		/* Read RX seqnum. */
> +		addr = base + mlen * i;
> +		val = readl_relaxed(addr);
> +
> +		rx_seqnum = FIELD_GET(EXYNOS_ACPM_PROTOCOL_SEQNUM, val);
> +		if (!rx_seqnum)
> +			return -EIO;
> +		/*
> +		 * mssg seqnum starts with value 1, whereas the driver considers
> +		 * the first mssg at index 0.
> +		 */
> +		seqnum = rx_seqnum - 1;
> +		rx_data = &chan->rx_data[seqnum];
> +
> +		if (rx_data->response) {
> +			if (rx_seqnum == tx_seqnum) {
> +				__ioread32_copy(rx->cmd, addr, req->rxlen / 4);
> +				rx_set = true;
> +				clear_bit(seqnum, chan->bitmap_seqnum);
> +			} else {
> +				/*
> +				 * The RX data corresponds to another request.
> +				 * Save the data to drain the queue, but don't
> +				 * clear yet the bitmap. It will be cleared
> +				 * after the response is copied to the request.
> +				 */
> +				__ioread32_copy(rx_data->cmd, addr,
> +						req->rxlen / 4);
> +			}
> +		} else {
> +			clear_bit(seqnum, chan->bitmap_seqnum);
> +		}
> +
> +		i = (i + 1) % chan->qlen;
> +	} while (i != rx_front);
> +
> +	/* We saved all responses, mark RX empty. */
> +	writel_relaxed(rx_front, chan->rx.rear);
> +
> +	/* Flush SRAM posted writes. */
> +	readl_relaxed(chan->rx.front);
> +
> +	/*
> +	 * If the response was not in this iteration of the queue, check if the
> +	 * RX data was previously saved.
> +	 */
> +	rx_data = &chan->rx_data[tx_seqnum - 1];
> +	if (!rx_set && rx_data->response) {
> +		rx_seqnum = FIELD_GET(EXYNOS_ACPM_PROTOCOL_SEQNUM,
> +				      rx_data->cmd[0]);
> +
> +		if (rx_seqnum == tx_seqnum) {
> +			memcpy(rx->cmd, rx_data->cmd, req->rxlen);
> +			clear_bit(rx_seqnum - 1, chan->bitmap_seqnum);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int exynos_acpm_dequeue_by_polling(struct mbox_chan *mbox_chan,
> +					  struct mbox_request *req)
> +{
> +	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
> +	struct exynos_acpm_message *tx = req->tx;
> +	struct device *dev = mbox_chan->mbox->dev;
> +	unsigned int cnt_20us = 0;
> +	u32 seqnum;
> +	int ret;
> +
> +	seqnum = FIELD_GET(EXYNOS_ACPM_PROTOCOL_SEQNUM, tx->cmd[0]);
> +
> +	do {
> +		ret = mutex_lock_interruptible(&chan->rx_lock);
> +		if (ret)
> +			return ret;
> +		ret = exynos_acpm_get_rx(mbox_chan, req);
> +		mutex_unlock(&chan->rx_lock);
> +		if (ret)
> +			return ret;
> +
> +		if (!test_bit(seqnum - 1, chan->bitmap_seqnum)) {
> +			dev_vdbg(dev, "cnt_20us = %d.\n", cnt_20us);
> +			return 0;
> +		}
> +
> +		/* Determined experimentally. */
> +		usleep_range(20, 30);
> +		cnt_20us++;
> +	} while (cnt_20us < EXYNOS_ACPM_POLL_TIMEOUT);
> +
> +	dev_err(dev, "Timeout! ch:%u s:%u bitmap:%lx, cnt_20us = %d.\n",
> +		chan->id, seqnum, chan->bitmap_seqnum[0], cnt_20us);
> +
> +	return -ETIME;
> +}
> +
> +static void exynos_acpm_done(struct exynos_acpm_work_data *work_data,
> +			     int status)
> +{
> +	struct mbox_request *req = work_data->req;
> +
> +	kfree(work_data);
> +	mbox_request_complete(req, status);
> +}
> +
> +static void exynos_acpm_work_handler(struct work_struct *work)
> +{
> +	struct exynos_acpm_work_data *work_data =
> +		container_of(work, struct exynos_acpm_work_data, work);
> +	struct mbox_chan *mbox_chan = work_data->mbox_chan;
> +	int ret;
> +
> +	ret = exynos_acpm_dequeue_by_polling(mbox_chan, work_data->req);
> +	work_data->callback(work_data, ret);
> +}
> +
> +static struct exynos_acpm_work_data *
> +	exynos_acpm_init_work(struct mbox_chan *mbox_chan,
> +			      struct mbox_request *req)
> +{
> +	struct exynos_acpm_work_data *work_data;
> +	gfp_t gfp = (req->flags & MBOX_REQ_MAY_SLEEP) ? GFP_KERNEL : GFP_ATOMIC;
> +
> +	work_data = kmalloc(sizeof(*work_data), gfp);
> +	if (!work_data)
> +		return ERR_PTR(-ENOMEM);
> +
> +	work_data->mbox_chan = mbox_chan;
> +	work_data->req = req;
> +	work_data->callback = exynos_acpm_done;
> +	INIT_WORK(&work_data->work, exynos_acpm_work_handler);
> +
> +	return work_data;
> +}
> +
> +static void exynos_acpm_prepare_request(struct mbox_chan *mbox_chan,
> +					struct mbox_request *req)
> +{
> +	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
> +	struct exynos_acpm_message *tx = req->tx;
> +	struct exynos_acpm_rx_data *rx_data;
> +
> +	/* Prevent chan->seqnum from being re-used */
> +	do {
> +		if (++chan->seqnum == EXYNOS_ACPM_SEQNUM_MAX)
> +			chan->seqnum = 1;
> +	} while (test_bit(chan->seqnum - 1, chan->bitmap_seqnum));
> +
> +	tx->cmd[0] |= FIELD_PREP(EXYNOS_ACPM_PROTOCOL_SEQNUM, chan->seqnum);
> +
> +	/* Clear data for upcoming responses */
> +	rx_data = &chan->rx_data[chan->seqnum - 1];
> +	memset(rx_data->cmd, 0, sizeof(*(rx_data->cmd)) * chan->mlen);
> +	if (req->rx)
> +		rx_data->response = true;
> +
> +	/* Flag the index based on seqnum. (seqnum: 1~63, bitmap: 0~62) */
> +	set_bit(chan->seqnum - 1, chan->bitmap_seqnum);
> +}
> +
> +static int exynos_acpm_wait_for_queue_slots(struct mbox_chan *mbox_chan,
> +					    u32 next_tx_front)
> +{
> +	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
> +	struct device *dev = mbox_chan->mbox->dev;
> +	u32 val, ret;
> +
> +	/*
> +	 * Wait for RX front to keep up with TX front. Make sure there's at
> +	 * least one element between them.
> +	 */
> +	ret = readl_relaxed_poll_timeout_atomic(chan->rx.front, val,
> +						next_tx_front != val, 1,
> +						EXYNOS_ACPM_TX_TIMEOUT_US);
> +	if (ret) {
> +		dev_err(dev, "RX front can not keep up with TX front.\n");
> +		return ret;
> +	}
> +
> +	ret = readl_relaxed_poll_timeout_atomic(chan->tx.rear, val,
> +						next_tx_front != val, 1,
> +						EXYNOS_ACPM_TX_TIMEOUT_US);
> +	if (ret)
> +		dev_err(dev, "TX queue is full.\n");
> +
> +	return ret;
> +}
> +
> +static int exynos_acpm_send_request(struct mbox_chan *mbox_chan,
> +				    struct mbox_request *req)
> +{
> +	struct exynos_acpm *exynos_acpm = dev_get_drvdata(mbox_chan->mbox->dev);
> +	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
> +	struct exynos_acpm_message *tx = req->tx;
> +	struct exynos_acpm_work_data *work_data;
> +	u32 idx, tx_front;
> +	unsigned long flags;
> +	int ret;
> +
> +	if (!tx || !tx->cmd || req->txlen > chan->mlen ||
> +	    req->rxlen > chan->mlen)
> +		return -EINVAL;
> +
> +	work_data = exynos_acpm_init_work(mbox_chan, req);
> +	if (IS_ERR(work_data))
> +		return PTR_ERR(work_data);
> +
> +	spin_lock_irqsave(&chan->tx_lock, flags);
> +
> +	tx_front = readl_relaxed(chan->tx.front);
> +	idx = (tx_front + 1) % chan->qlen;
> +
> +	ret = exynos_acpm_wait_for_queue_slots(mbox_chan, idx);
> +	if (ret)
> +		goto exit;
> +
> +	exynos_acpm_prepare_request(mbox_chan, req);
> +
> +	/* Write TX command. */
> +	__iowrite32_copy(chan->tx.base + chan->mlen * tx_front, tx->cmd,
> +			 req->txlen / 4);
> +
> +	/* Advance TX front. */
> +	writel_relaxed(idx, chan->tx.front);
> +
> +	/* Flush SRAM posted writes. */
> +	readl_relaxed(chan->tx.front);
> +
> +	/* Generate ACPM interrupt. */
> +	writel_relaxed(BIT(chan->id), exynos_acpm->regs + EXYNOS_ACPM_INTGR1);
> +
> +	/* Flush mailbox controller posted writes. */
> +	readl_relaxed(exynos_acpm->regs + EXYNOS_ACPM_MCUCTRL);
> +
> +	spin_unlock_irqrestore(&chan->tx_lock, flags);
> +
> +	queue_work(exynos_acpm->wq, &work_data->work);
> +
> +	return -EINPROGRESS;
> +exit:
> +	spin_unlock_irqrestore(&chan->tx_lock, flags);
> +	kfree(work_data);
> +	return ret;
> +}
> +
> +static int exynos_acpm_chan_startup(struct mbox_chan *mbox_chan)
> +{
> +	struct exynos_acpm_chan *chan = mbox_chan->con_priv;
> +
> +	if (!chan->polling) {
> +		dev_err(mbox_chan->mbox->dev, "IRQs not supported.\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct mbox_chan_ops exynos_acpm_chan_ops = {
> +	.send_request = exynos_acpm_send_request,
> +	.startup = exynos_acpm_chan_startup,
> +};
> +
> +static void __iomem *exynos_acpm_get_iomem_addr(void __iomem *base,
> +						void __iomem *addr)
> +{
> +	u32 offset;
> +
> +	offset = readl_relaxed(addr);
> +	return base + offset;
> +}
> +
> +static void exynos_acpm_rx_queue_init(struct exynos_acpm *exynos_acpm,
> +				      struct exynos_acpm_shmem_chan *shmem_chan,
> +				      struct exynos_acpm_queue *rx)
> +{
> +	void __iomem *base = exynos_acpm->sram_base;
> +
> +	rx->base = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_base);
> +	rx->front = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_front);
> +	rx->rear = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_rear);
> +}
> +
> +static void exynos_acpm_tx_queue_init(struct exynos_acpm *exynos_acpm,
> +				      struct exynos_acpm_shmem_chan *shmem_chan,
> +				      struct exynos_acpm_queue *tx)
> +{
> +	void __iomem *base = exynos_acpm->sram_base;
> +
> +	tx->base = exynos_acpm_get_iomem_addr(base, &shmem_chan->rx_base);
> +	tx->front = exynos_acpm_get_iomem_addr(base, &shmem_chan->rx_front);
> +	tx->rear = exynos_acpm_get_iomem_addr(base, &shmem_chan->rx_rear);
> +}
> +
> +static int exynos_acpm_alloc_cmds(struct exynos_acpm *exynos_acpm,
> +				  struct exynos_acpm_chan *chan)
> +{
> +	struct device *dev = exynos_acpm->dev;
> +	struct exynos_acpm_rx_data *rx_data;
> +	unsigned int mlen = chan->mlen;
> +	int i;
> +
> +	for (i = 0; i < EXYNOS_ACPM_SEQNUM_MAX; i++) {
> +		rx_data = &chan->rx_data[i];
> +		rx_data->cmd = devm_kcalloc(dev, mlen, sizeof(*(rx_data->cmd)),
> +					    GFP_KERNEL);
> +		if (!rx_data->cmd)
> +			return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +static int exynos_acpm_chans_init(struct exynos_acpm *exynos_acpm)
> +{
> +	struct exynos_acpm_shmem_chan *shmem_chans, *shmem_chan;
> +	struct exynos_acpm_shmem *shmem = exynos_acpm->shmem;
> +	struct mbox_chan *mbox_chan, *mbox_chans;
> +	struct exynos_acpm_chan *chan, *chans;
> +	struct device *dev = exynos_acpm->dev;
> +	int i, ret;
> +
> +	exynos_acpm->num_chans = readl_relaxed(&shmem->num_chans);
> +
> +	mbox_chans = devm_kcalloc(dev, exynos_acpm->num_chans,
> +				  sizeof(*mbox_chans), GFP_KERNEL);
> +	if (!mbox_chans)
> +		return -ENOMEM;
> +
> +	chans = devm_kcalloc(dev, exynos_acpm->num_chans, sizeof(*chans),
> +			     GFP_KERNEL);
> +	if (!chans)
> +		return -ENOMEM;
> +
> +	shmem_chans = exynos_acpm_get_iomem_addr(exynos_acpm->sram_base,
> +						 &shmem->chans);
> +
> +	for (i = 0; i < exynos_acpm->num_chans; i++) {
> +		shmem_chan = &shmem_chans[i];
> +		mbox_chan = &mbox_chans[i];
> +		chan = &chans[i];
> +
> +		/* Set parameters for the mailbox channel. */
> +		mbox_chan->con_priv = chan;
> +		mbox_chan->mbox = exynos_acpm->mbox;
> +
> +		/* Set parameters for the ACPM channel. */
> +		chan->mlen = readl_relaxed(&shmem_chan->mlen);
> +
> +		ret = exynos_acpm_alloc_cmds(exynos_acpm, chan);
> +		if (ret)
> +			return ret;
> +
> +		chan->polling = readl_relaxed(&shmem_chan->polling);
> +		chan->id = readl_relaxed(&shmem_chan->id);
> +		chan->qlen = readl_relaxed(&shmem_chan->qlen);
> +
> +		exynos_acpm_rx_queue_init(exynos_acpm, shmem_chan, &chan->rx);
> +		exynos_acpm_tx_queue_init(exynos_acpm, shmem_chan, &chan->tx);
> +
> +		mutex_init(&chan->rx_lock);
> +		spin_lock_init(&chan->tx_lock);
> +
> +		dev_vdbg(dev, "ID = %d poll = %d, mlen = %d, qlen = %d\n",
> +			 chan->id, chan->polling, chan->mlen, chan->qlen);
> +	}
> +
> +	/* Save pointers to the ACPM and mailbox channels. */
> +	exynos_acpm->mbox->chans = mbox_chans;
> +	exynos_acpm->chans = chans;
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id exynos_acpm_match[] = {
> +	{ .compatible = "google,gs101-acpm" },
> +	{},
> +};
> +MODULE_DEVICE_TABLE(of, exynos_acpm_match);
> +
> +static int exynos_acpm_probe(struct platform_device *pdev)
> +{
> +	struct device_node *node = pdev->dev.of_node;
> +	struct device *dev = &pdev->dev;
> +	struct exynos_acpm *exynos_acpm;
> +	struct mbox_controller *mbox;
> +	struct device_node *shmem;
> +	resource_size_t size;
> +	struct resource res;
> +	const __be32 *prop;
> +	int ret;
> +
> +	exynos_acpm = devm_kzalloc(dev, sizeof(*exynos_acpm), GFP_KERNEL);
> +	if (!exynos_acpm)
> +		return -ENOMEM;
> +
> +	mbox = devm_kzalloc(dev, sizeof(*mbox), GFP_KERNEL);
> +	if (!mbox)
> +		return -ENOMEM;
> +
> +	exynos_acpm->regs = devm_platform_ioremap_resource(pdev, 0);
> +	if (IS_ERR(exynos_acpm->regs))
> +		return PTR_ERR(exynos_acpm->regs);
> +
> +	shmem = of_parse_phandle(node, "shmem", 0);
> +	ret = of_address_to_resource(shmem, 0, &res);
> +	of_node_put(shmem);
> +	if (ret) {
> +		dev_err(dev, "Failed to get shared memory.\n");
> +		return ret;
> +	}
> +
> +	size = resource_size(&res);
> +	exynos_acpm->sram_base = devm_ioremap(dev, res.start, size);
> +	if (!exynos_acpm->sram_base) {
> +		dev_err(dev, "Failed to ioremap shared memory.\n");
> +		return -ENOMEM;
> +	}
> +
> +	prop = of_get_property(node, "initdata-base", NULL);
> +	if (!prop) {
> +		dev_err(dev, "Parsing initdata_base failed.\n");
> +		return -EINVAL;
> +	}
> +
> +	exynos_acpm->pclk = devm_clk_get(dev, "pclk");
> +	if (IS_ERR(exynos_acpm->pclk)) {
> +		dev_err(dev, "Missing peripheral clock.\n");
> +		return PTR_ERR(exynos_acpm->pclk);
> +	}
> +
> +	ret = clk_prepare_enable(exynos_acpm->pclk);
> +	if (ret) {
> +		dev_err(dev, "Failed to enable the peripheral clock.\n");
> +		return ret;
> +	}
> +
> +	exynos_acpm->wq = alloc_workqueue("exynos-acpm-wq", 0, 0);
> +	if (!exynos_acpm->wq)
> +		return -ENOMEM;
> +
> +	exynos_acpm->dev = dev;
> +	exynos_acpm->mbox = mbox;
> +	exynos_acpm->shmem = exynos_acpm->sram_base + be32_to_cpup(prop);
> +
> +	ret = exynos_acpm_chans_init(exynos_acpm);
> +	if (ret)
> +		return ret;
> +
> +	mbox->num_chans = exynos_acpm->num_chans;
> +	mbox->dev = dev;
> +	mbox->ops = &exynos_acpm_chan_ops;
> +
> +	platform_set_drvdata(pdev, exynos_acpm);
> +
> +	/* Mask out all interrupts. We support just polling channels for now. */
> +	writel_relaxed(EXYNOS_ACPM_INTMR0_MASK,
> +		       exynos_acpm->regs + EXYNOS_ACPM_INTMR0);
> +
> +	ret = devm_mbox_controller_register(dev, mbox);
> +	if (ret)
> +		dev_err(dev, "Failed to register mbox_controller(%d).\n",
> ret);
> +
> +	return ret;
> +}
> +
> +static void exynos_acpm_remove(struct platform_device *pdev)
> +{
> +	struct exynos_acpm *exynos_acpm = platform_get_drvdata(pdev);
> +
> +	flush_workqueue(exynos_acpm->wq);
> +	destroy_workqueue(exynos_acpm->wq);
> +	clk_disable_unprepare(exynos_acpm->pclk);
> +}
> +
> +static struct platform_driver exynos_acpm_driver = {
> +	.probe	= exynos_acpm_probe,
> +	.remove = exynos_acpm_remove,
> +	.driver	= {
> +		.name = "exynos-acpm",
> +		.owner	= THIS_MODULE,
> +		.of_match_table	= exynos_acpm_match,
> +	},
> +};
> +module_platform_driver(exynos_acpm_driver);
> +
> +MODULE_AUTHOR("Tudor Ambarus <tudor.ambarus@linaro.org>");
> +MODULE_DESCRIPTION("EXYNOS ACPM mailbox driver");
> +MODULE_LICENSE("GPL");
> diff --git a/include/linux/mailbox/exynos-acpm-message.h b/include/linux/mailbox/exynos-acpm-message.h
> new file mode 100644
> index 000000000000..3799868c40b8
> --- /dev/null
> +++ b/include/linux/mailbox/exynos-acpm-message.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright 2024 Linaro Ltd.
> + */
> +
> +#ifndef _LINUX_EXYNOS_ACPM_MESSAGE_H_
> +#define _LINUX_EXYNOS_ACPM_MESSAGE_H_
> +
> +#include <linux/types.h>
> +
> +/**
> + * struct exynos_acpm_message - exynos ACPM mailbox message format.
> + * @cmd: pointer to u32 command.
> + * @len: length of the command.
> + */
> +struct exynos_acpm_message {
> +	u32 *cmd;
> +	size_t len;
> +};
> +
> +#endif /* _LINUX_EXYNOS_ACPM_MESSAGE_H_ */
> --
> 2.47.0.rc1.288.g06298d1525-goog




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-21 14:12     ` Tudor Ambarus
@ 2024-10-22  4:38       ` Krzysztof Kozlowski
  2024-10-22  7:27         ` Vincent Guittot
  2024-10-22  7:58         ` Tudor Ambarus
  0 siblings, 2 replies; 25+ messages in thread
From: Krzysztof Kozlowski @ 2024-10-22  4:38 UTC (permalink / raw)
  To: Tudor Ambarus, jassisinghbrar
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

On 21/10/2024 16:12, Tudor Ambarus wrote:
> Hi, Krzysztof,
> 
> On 10/21/24 12:52 PM, Krzysztof Kozlowski wrote:
>> On 17/10/2024 18:36, Tudor Ambarus wrote:
>>> ACPM (Alive Clock and Power Manager) is a firmware that operates on the
>>> APM (Active Power Management) module that handles overall power management
>>> activities. ACPM and masters regard each other as independent
>>> hardware component and communicate with each other using mailbox messages
>>> and shared memory.
>>>
>>> The mailbox channels are initialized based on the configuration data
>>> found at a specific offset into the shared memory (mmio-sram). The
>>> configuration data consists of channel id, message and queue lengths,
>>> pointers to the RX and TX queues which are also part of the SRAM, and
>>> whether RX works by polling or interrupts. All known clients of this
>>> driver are using polling channels, thus the driver implements for now
>>> just polling mode.
>>>
>>> Add support for the exynos acpm core driver. Helper drivers will follow.
>>> These will construct the mailbox messages in the format expected by the
>>> firmware.
>>
>> I skimmed through the driver and I do not understand why this is
>> firmware. You are implementing a mailbox provider/controller.
> 
> In my case the mailbox hardware is used just to raise the interrupt to
> the other side. Then there's the SRAM which contains the channels
> configuration data and the TX/RX queues. The enqueue/deque is done
> in/from SRAM. This resembles a lot with drivers/firmware/arm_scmi/, see:
> 
> drivers/firmware/arm_scmi/shmem.c
> drivers/firmware/arm_scmi/transports/mailbox.c

Wait, SCMI is an interface. Not the case here.

> 
> After the SRAM and mailbox/transport code I'll come up with two helper
> drivers that construct the mailbox messages in the format expected by
> the firmware. There are 2 types of messages recognized by the ACPM
> firmware: PMIC and DVFS. The client drivers will use these helper
> drivers to prepare a specific message. Then they will use the mailbox
> core to send the message and they'll wait for the answer.
> 
> This layered structure and the use of SRAM resembles with arm_scmi and
> made me think that the ACPM driver it's better suited for
> drivers/firmware. I'm opened for suggestions though.

Sure, but then this driver cannot perform mbox_controller_register().
Only mailbox providers, so drivers in mailbox, use it.

> 
>>
>> I did not perform full review yet, just skimmed over the code. I will
>> take a look a bit later.
>>
> 
> No worries, thanks for doing it. I agree with all the suggestions from
> below and I'll address them after we get an agreement with Jassi on how
> to extend the mailbox core.
> 
> More answers below.
> 
>>>
>>> Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
>>> ---
>>>  drivers/firmware/Kconfig                    |   1 +
>>>  drivers/firmware/Makefile                   |   1 +
>>>  drivers/firmware/samsung/Kconfig            |  11 +
>>>  drivers/firmware/samsung/Makefile           |   3 +
>>>  drivers/firmware/samsung/exynos-acpm.c      | 703 ++++++++++++++++++++
>>
>> Please add directory to the Samsung Exynos SoC maintainer entry. I also
>> encourage adding separate entry for the driver where you would be listed
>> as maintainer.
> 
> ok
> 
>>
>> There is no firmware tree, so this will be going via Samsung SoC.
> 
> I noticed afterwards, thanks.
> 
>>
>>>  include/linux/mailbox/exynos-acpm-message.h |  21 +
>>>  6 files changed, 740 insertions(+)
>>>  create mode 100644 drivers/firmware/samsung/Kconfig
>>>  create mode 100644 drivers/firmware/samsung/Makefile
>>>  create mode 100644 drivers/firmware/samsung/exynos-acpm.c
>>>  create mode 100644 include/linux/mailbox/exynos-acpm-message.h
>>>
>>> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
>>> index 71d8b26c4103..24edb956831b 100644
>>> --- a/drivers/firmware/Kconfig
>>> +++ b/drivers/firmware/Kconfig
>>> @@ -267,6 +267,7 @@ source "drivers/firmware/meson/Kconfig"
>>>  source "drivers/firmware/microchip/Kconfig"
>>>  source "drivers/firmware/psci/Kconfig"
>>>  source "drivers/firmware/qcom/Kconfig"
>>> +source "drivers/firmware/samsung/Kconfig"
>>>  source "drivers/firmware/smccc/Kconfig"
>>>  source "drivers/firmware/tegra/Kconfig"
>>>  source "drivers/firmware/xilinx/Kconfig"
>>> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
>>> index 7a8d486e718f..91efcc868a05 100644
>>> --- a/drivers/firmware/Makefile
>>> +++ b/drivers/firmware/Makefile
>>> @@ -33,6 +33,7 @@ obj-y				+= efi/
>>>  obj-y				+= imx/
>>>  obj-y				+= psci/
>>>  obj-y				+= qcom/
>>> +obj-y				+= samsung/
>>>  obj-y				+= smccc/
>>>  obj-y				+= tegra/
>>>  obj-y				+= xilinx/
>>> diff --git a/drivers/firmware/samsung/Kconfig b/drivers/firmware/samsung/Kconfig
>>> new file mode 100644
>>> index 000000000000..f908773c1441
>>> --- /dev/null
>>> +++ b/drivers/firmware/samsung/Kconfig
>>> @@ -0,0 +1,11 @@
>>> +# SPDX-License-Identifier: GPL-2.0-only
>>> +
>>> +config EXYNOS_ACPM
>>> +	tristate "Exynos ACPM (Alive Clock and Power Manager) driver support"
>>
>> depends ARCH_EXYNOS || COMPILE_TEST
> 
> oh yes.
>>
>> Please also send a arm64 defconfig change making it a module.
> 
> will do
> 
> cut
> 
>>> +
>>> +/**
>>> + * struct exynos_acpm_shmem_chan - descriptor of a shared memory channel.
>>> + *
>>> + * @id:			channel ID.
>>> + * @reserved:		reserved for future use.
>>> + * @rx_rear:		rear pointer of RX queue.
>>> + * @rx_front:		front pointer of RX queue.
>>> + * @rx_base:		base address of RX queue.
>>> + * @reserved1:		reserved for future use.
>>> + * @tx_rear:		rear pointer of TX queue.
>>> + * @tx_front:		front pointer of TX queue.
>>> + * @tx_base:		base address of TX queue.
>>> + * @qlen:		queue length. Applies to both TX/RX queues.
>>> + * @mlen:		message length. Applies to both TX/RX queues.
>>> + * @reserved2:		reserved for future use.
>>> + * @polling:		true when the channel works on polling.
>>> + */
>>> +struct exynos_acpm_shmem_chan {
>>> +	u32 id;
>>> +	u32 reserved[3];
>>> +	u32 rx_rear;
>>> +	u32 rx_front;
>>> +	u32 rx_base;
>>> +	u32 reserved1[3];
>>> +	u32 tx_rear;
>>> +	u32 tx_front;
>>> +	u32 tx_base;
>>> +	u32 qlen;
>>> +	u32 mlen;
>>> +	u32 reserved2[2];
>>> +	u32 polling;
>>
>> Why are you storing addresses as u32? Shouldn't these be __iomem*?
> 
> This structure defines the offsets in SRAM that describe the channel
> parameters. Instances of this struct shall be declared indeed as:
> 	struct exynos_acpm_shmem_chan __iomem *shmem_chan;
> I missed that in v2, but will update in v2.
> 
>>
>> I also cannot find any piece of code setting several of above, e.g. tx_base
> 
> I'm not writing any SRAM configuration fields, these fields are used to
> read/retrive the channel parameters from SRAM.

I meant tx_base is always 0. Where is this property set? Ever?

> 
> cut
> 
>>> +
>>> +	spin_lock_irqsave(&chan->tx_lock, flags);
>>> +
>>> +	tx_front = readl_relaxed(chan->tx.front);
>>> +	idx = (tx_front + 1) % chan->qlen;
>>> +
>>> +	ret = exynos_acpm_wait_for_queue_slots(mbox_chan, idx);
>>> +	if (ret)
>>> +		goto exit;
>>> +
>>> +	exynos_acpm_prepare_request(mbox_chan, req);
>>> +
>>> +	/* Write TX command. */
>>> +	__iowrite32_copy(chan->tx.base + chan->mlen * tx_front, tx->cmd,
>>> +			 req->txlen / 4);
>>> +
>>> +	/* Advance TX front. */
>>> +	writel_relaxed(idx, chan->tx.front);
>>> +
>>> +	/* Flush SRAM posted writes. */
>>> +	readl_relaxed(chan->tx.front);
>>> +
>>
>> How does this flush work? Maybe you just miss here barries (s/relaxed//)?
> 
> I think _relaxed() accessors should be fine in this driver as there are
> no DMA accesses involved. _relaxed() accessors are fully ordered for
> accesses to the same device/endpoint.
> 
> I'm worried however about the posted writes, the buses the devices sit
> on may themselves have asynchronicity. So I issue a read from the same
> device to ensure that the write has occured.

Hm, ok, it seems this is actually the standard way for a posted bus.

> 
> There is something that I haven't dimistified though. You'll notice that
> the writes from above are on SRAM. I enqueue on the TX queue then
> advance the head/front of the queue and then I read back to make sure
> that the writes occured. Below I write to the controller's interrupt
> register (different device/endpoint) to raise the interrupt for the
> counterpart and inform that TX queue advanced. I'm not sure whether I
> need a barrier here to make sure that the CPU does not reorder the
> accesses and raise the interrupt before advancing the TX queue. If
> someone already knows the answer, please say, I'll do some more reading
> in the meantime.

I think it is fine.


Best regards,
Krzysztof




* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-22  4:38       ` Krzysztof Kozlowski
@ 2024-10-22  7:27         ` Vincent Guittot
  2024-10-22  7:58         ` Tudor Ambarus
  1 sibling, 0 replies; 25+ messages in thread
From: Vincent Guittot @ 2024-10-22  7:27 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Tudor Ambarus, jassisinghbrar, alim.akhtar, mst, javierm,
	tzimmermann, bartosz.golaszewski, luzmaximilian, sudeep.holla,
	conor.dooley, bjorn, ulf.hansson, linux-samsung-soc, linux-kernel,
	linux-arm-kernel, marcan, neal, alyssa, broonie, andre.draszik,
	willmcvicker, peter.griffin, kernel-team, daniel.lezcano

On Tue, 22 Oct 2024 at 06:38, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On 21/10/2024 16:12, Tudor Ambarus wrote:
> > Hi, Krzysztof,
> >
> > On 10/21/24 12:52 PM, Krzysztof Kozlowski wrote:
> >> On 17/10/2024 18:36, Tudor Ambarus wrote:
> >>> ACPM (Alive Clock and Power Manager) is a firmware that operates on the
> >>> APM (Active Power Management) module that handles overall power management
> >>> activities. ACPM and masters regard each other as independent
> >>> hardware component and communicate with each other using mailbox messages
> >>> and shared memory.
> >>>
> >>> The mailbox channels are initialized based on the configuration data
> >>> found at a specific offset into the shared memory (mmio-sram). The
> >>> configuration data consists of channel id, message and queue lengths,
> >>> pointers to the RX and TX queues which are also part of the SRAM, and
> >>> whether RX works by polling or interrupts. All known clients of this
> >>> driver are using polling channels, thus the driver implements for now
> >>> just polling mode.
> >>>
> >>> Add support for the exynos acpm core driver. Helper drivers will follow.
> >>> These will construct the mailbox messages in the format expected by the
> >>> firmware.
> >>
> >> I skimmed through the driver and I do not understand why this is
> >> firmware. You are implementing a mailbox provider/controller.
> >
> > In my case the mailbox hardware is used just to raise the interrupt to
> > the other side. Then there's the SRAM which contains the channels
> > configuration data and the TX/RX queues. The enqueue/deque is done
> > in/from SRAM. This resembles a lot with drivers/firmware/arm_scmi/, see:
> >
> > drivers/firmware/arm_scmi/shmem.c
> > drivers/firmware/arm_scmi/transports/mailbox.c
>
> Wait, SCMI is an interface. Not the case here.

How is it different from SCMI? They both use a mailbox and shared
memory to place a message in the SRAM and ping the other side with the
mailbox. The only difference is that SCMI is a public spec whereas this
one is specific to some SoCs.

>
> >
> > After the SRAM and mailbox/transport code I'll come up with two helper
> > drivers that construct the mailbox messages in the format expected by
> > the firmware. There are 2 types of messages recognized by the ACPM
> > firmware: PMIC and DVFS. The client drivers will use these helper
> > drivers to prepare a specific message. Then they will use the mailbox
> > core to send the message and they'll wait for the answer.
> >
> > This layered structure and the use of SRAM resembles with arm_scmi and
> > made me think that the ACPM driver it's better suited for
> > drivers/firmware. I'm opened for suggestions though.
>
> Sure, but then this driver cannot perform mbox_controller_register().
> Only mailbox providers, so drivers in mailbox, use it.

Yes, the mailbox driver part should go into drivers/mailbox, and this
part should then use mbox_request_channel() to get a channel.
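
(For reference, a minimal consumer-side sketch of that flow; this is
illustrative only and not code from this series, with dev and msg
assumed to be set up by the caller:)

	struct mbox_client cl = {
		.dev      = dev,
		.tx_block = true,
		.tx_tout  = 500,	/* ms */
	};
	struct mbox_chan *chan;
	int ret;

	chan = mbox_request_channel(&cl, 0);
	if (IS_ERR(chan))
		return PTR_ERR(chan);

	/* msg would be whatever the provider expects, e.g. a
	 * struct exynos_acpm_message as defined in this series. */
	ret = mbox_send_message(chan, &msg);
	mbox_free_channel(chan);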

>
> >
> >>
> >> I did not perform full review yet, just skimmed over the code. I will
> >> take a look a bit later.
> >>
> >
> > No worries, thanks for doing it. I agree with all the suggestions from
> > below and I'll address them after we get an agreement with Jassi on how
> > to extend the mailbox core.
> >
> > More answers below.
> >
> >>>
> >>> Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
> >>> ---
> >>>  drivers/firmware/Kconfig                    |   1 +
> >>>  drivers/firmware/Makefile                   |   1 +
> >>>  drivers/firmware/samsung/Kconfig            |  11 +
> >>>  drivers/firmware/samsung/Makefile           |   3 +
> >>>  drivers/firmware/samsung/exynos-acpm.c      | 703 ++++++++++++++++++++
> >>
> >> Please add directory to the Samsung Exynos SoC maintainer entry. I also
> >> encourage adding separate entry for the driver where you would be listed
> >> as maintainer.
> >
> > ok
> >
> >>
> >> There is no firmware tree, so this will be going via Samsung SoC.
> >
> > I noticed afterwards, thanks.
> >
> >>
> >>>  include/linux/mailbox/exynos-acpm-message.h |  21 +
> >>>  6 files changed, 740 insertions(+)
> >>>  create mode 100644 drivers/firmware/samsung/Kconfig
> >>>  create mode 100644 drivers/firmware/samsung/Makefile
> >>>  create mode 100644 drivers/firmware/samsung/exynos-acpm.c
> >>>  create mode 100644 include/linux/mailbox/exynos-acpm-message.h
> >>>
> >>> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> >>> index 71d8b26c4103..24edb956831b 100644
> >>> --- a/drivers/firmware/Kconfig
> >>> +++ b/drivers/firmware/Kconfig
> >>> @@ -267,6 +267,7 @@ source "drivers/firmware/meson/Kconfig"
> >>>  source "drivers/firmware/microchip/Kconfig"
> >>>  source "drivers/firmware/psci/Kconfig"
> >>>  source "drivers/firmware/qcom/Kconfig"
> >>> +source "drivers/firmware/samsung/Kconfig"
> >>>  source "drivers/firmware/smccc/Kconfig"
> >>>  source "drivers/firmware/tegra/Kconfig"
> >>>  source "drivers/firmware/xilinx/Kconfig"
> >>> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> >>> index 7a8d486e718f..91efcc868a05 100644
> >>> --- a/drivers/firmware/Makefile
> >>> +++ b/drivers/firmware/Makefile
> >>> @@ -33,6 +33,7 @@ obj-y                             += efi/
> >>>  obj-y                              += imx/
> >>>  obj-y                              += psci/
> >>>  obj-y                              += qcom/
> >>> +obj-y                              += samsung/
> >>>  obj-y                              += smccc/
> >>>  obj-y                              += tegra/
> >>>  obj-y                              += xilinx/
> >>> diff --git a/drivers/firmware/samsung/Kconfig b/drivers/firmware/samsung/Kconfig
> >>> new file mode 100644
> >>> index 000000000000..f908773c1441
> >>> --- /dev/null
> >>> +++ b/drivers/firmware/samsung/Kconfig
> >>> @@ -0,0 +1,11 @@
> >>> +# SPDX-License-Identifier: GPL-2.0-only
> >>> +
> >>> +config EXYNOS_ACPM
> >>> +   tristate "Exynos ACPM (Alive Clock and Power Manager) driver support"
> >>
> >> depends ARCH_EXYNOS || COMPILE_TEST
> >
> > oh yes.
> >>
> >> Please also send a arm64 defconfig change making it a module.
> >
> > will do
> >
> > cut
> >
> >>> +
> >>> +/**
> >>> + * struct exynos_acpm_shmem_chan - descriptor of a shared memory channel.
> >>> + *
> >>> + * @id:                    channel ID.
> >>> + * @reserved:              reserved for future use.
> >>> + * @rx_rear:               rear pointer of RX queue.
> >>> + * @rx_front:              front pointer of RX queue.
> >>> + * @rx_base:               base address of RX queue.
> >>> + * @reserved1:             reserved for future use.
> >>> + * @tx_rear:               rear pointer of TX queue.
> >>> + * @tx_front:              front pointer of TX queue.
> >>> + * @tx_base:               base address of TX queue.
> >>> + * @qlen:          queue length. Applies to both TX/RX queues.
> >>> + * @mlen:          message length. Applies to both TX/RX queues.
> >>> + * @reserved2:             reserved for future use.
> >>> + * @polling:               true when the channel works on polling.
> >>> + */
> >>> +struct exynos_acpm_shmem_chan {
> >>> +   u32 id;
> >>> +   u32 reserved[3];
> >>> +   u32 rx_rear;
> >>> +   u32 rx_front;
> >>> +   u32 rx_base;
> >>> +   u32 reserved1[3];
> >>> +   u32 tx_rear;
> >>> +   u32 tx_front;
> >>> +   u32 tx_base;
> >>> +   u32 qlen;
> >>> +   u32 mlen;
> >>> +   u32 reserved2[2];
> >>> +   u32 polling;
> >>
> >> Why are you storing addresses as u32? Shouldn't these be __iomem*?
> >
> > This structure defines the offsets in SRAM that describe the channel
> > parameters. Instances of this struct shall be declared indeed as:
> >       struct exynos_acpm_shmem_chan __iomem *shmem_chan;
> > I missed that in v2, but will update in v2.
> >
> >>
> >> I also cannot find any piece of code setting several of above, e.g. tx_base
> >
> > I'm not writing any SRAM configuration fields, these fields are used to
> > read/retrive the channel parameters from SRAM.
>
> I meany tx_base is always 0. Where is this property set? Ever?
>
> >
> > cut
> >
> >>> +
> >>> +   spin_lock_irqsave(&chan->tx_lock, flags);
> >>> +
> >>> +   tx_front = readl_relaxed(chan->tx.front);
> >>> +   idx = (tx_front + 1) % chan->qlen;
> >>> +
> >>> +   ret = exynos_acpm_wait_for_queue_slots(mbox_chan, idx);
> >>> +   if (ret)
> >>> +           goto exit;
> >>> +
> >>> +   exynos_acpm_prepare_request(mbox_chan, req);
> >>> +
> >>> +   /* Write TX command. */
> >>> +   __iowrite32_copy(chan->tx.base + chan->mlen * tx_front, tx->cmd,
> >>> +                    req->txlen / 4);
> >>> +
> >>> +   /* Advance TX front. */
> >>> +   writel_relaxed(idx, chan->tx.front);
> >>> +
> >>> +   /* Flush SRAM posted writes. */
> >>> +   readl_relaxed(chan->tx.front);
> >>> +
> >>
> >> How does this flush work? Maybe you just miss here barries (s/relaxed//)?
> >
> > I think _relaxed() accessors should be fine in this driver as there are
> > no DMA accesses involved. _relaxed() accessors are fully ordered for
> > accesses to the same device/endpoint.
> >
> > I'm worried however about the posted writes, the buses the devices sit
> > on may themselves have asynchronicity. So I issue a read from the same
> > device to ensure that the write has occured.
>
> Hm, ok, it seems it is actually standard way for posted bus.
>
> >
> > There is something that I haven't dimistified though. You'll notice that
> > the writes from above are on SRAM. I enqueue on the TX queue then
> > advance the head/front of the queue and then I read back to make sure
> > that the writes occured. Below I write to the controller's interrupt
> > register (different device/endpoint) to raise the interrupt for the
> > counterpart and inform that TX queue advanced. I'm not sure whether I
> > need a barrier here to make sure that the CPU does not reorder the
> > accesses and raise the interrupt before advancing the TX queue. If
> > someone already knows the answer, please say, I'll do some more reading
> > in the meantime.
>
> I think it is fine.
>
>
> Best regards,
> Krzysztof
>



* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-22  4:38       ` Krzysztof Kozlowski
  2024-10-22  7:27         ` Vincent Guittot
@ 2024-10-22  7:58         ` Tudor Ambarus
  2024-10-23  9:00           ` Krzysztof Kozlowski
  1 sibling, 1 reply; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-22  7:58 UTC (permalink / raw)
  To: Krzysztof Kozlowski, jassisinghbrar
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

Hi, Krzysztof,

On 10/22/24 5:38 AM, Krzysztof Kozlowski wrote:

cut

>>> I skimmed through the driver and I do not understand why this is
>>> firmware. You are implementing a mailbox provider/controller.
>>
>> In my case the mailbox hardware is used just to raise the interrupt to
>> the other side. Then there's the SRAM which contains the channels
>> configuration data and the TX/RX queues. The enqueue/deque is done
>> in/from SRAM. This resembles a lot with drivers/firmware/arm_scmi/, see:
>>
>> drivers/firmware/arm_scmi/shmem.c
>> drivers/firmware/arm_scmi/transports/mailbox.c
> 
> Wait, SCMI is an interface. Not the case here.
> 
>>
>> After the SRAM and mailbox/transport code I'll come up with two helper
>> drivers that construct the mailbox messages in the format expected by
>> the firmware. There are 2 types of messages recognized by the ACPM
>> firmware: PMIC and DVFS. The client drivers will use these helper
>> drivers to prepare a specific message. Then they will use the mailbox
>> core to send the message and they'll wait for the answer.
>>
>> This layered structure and the use of SRAM resembles with arm_scmi and
>> made me think that the ACPM driver it's better suited for
>> drivers/firmware. I'm opened for suggestions though.
> 
> Sure, but then this driver cannot perform mbox_controller_register().
> Only mailbox providers, so drivers in mailbox, use it.
> 

Okay, I can move the driver to drivers/mailbox/.

cut

>>>> +/**
>>>> + * struct exynos_acpm_shmem_chan - descriptor of a shared memory channel.
>>>> + *
>>>> + * @id:			channel ID.
>>>> + * @reserved:		reserved for future use.
>>>> + * @rx_rear:		rear pointer of RX queue.
>>>> + * @rx_front:		front pointer of RX queue.
>>>> + * @rx_base:		base address of RX queue.
>>>> + * @reserved1:		reserved for future use.
>>>> + * @tx_rear:		rear pointer of TX queue.
>>>> + * @tx_front:		front pointer of TX queue.
>>>> + * @tx_base:		base address of TX queue.
>>>> + * @qlen:		queue length. Applies to both TX/RX queues.
>>>> + * @mlen:		message length. Applies to both TX/RX queues.
>>>> + * @reserved2:		reserved for future use.
>>>> + * @polling:		true when the channel works on polling.
>>>> + */
>>>> +struct exynos_acpm_shmem_chan {
>>>> +	u32 id;
>>>> +	u32 reserved[3];
>>>> +	u32 rx_rear;
>>>> +	u32 rx_front;
>>>> +	u32 rx_base;
>>>> +	u32 reserved1[3];
>>>> +	u32 tx_rear;
>>>> +	u32 tx_front;
>>>> +	u32 tx_base;
>>>> +	u32 qlen;
>>>> +	u32 mlen;
>>>> +	u32 reserved2[2];
>>>> +	u32 polling;
>>>

cut

>>>
>>> I also cannot find any piece of code setting several of above, e.g. tx_base
>>
>> I'm not writing any SRAM configuration fields, these fields are used to
>> read/retrive the channel parameters from SRAM.
> 
> I meany tx_base is always 0. Where is this property set? Ever?

It's not zero. My assumption is it is set in the acpm firmware, but I
don't have access to that to verify. Here are some debug prints made in
the linux driver:

[    0.069575][    T1] gs-acpm-ipc 17610000.mailbox: exynos_mbox_chan_init ID = 2 poll = 1, mlen = 16, qlen = 5
[    0.069927][    T1] gs-acpm-ipc 17610000.mailbox: exynos_mbox_chan_init ID = 2 offsets: rx_base = 0x00038290 rx_front = 0x0003828c, rx_rear = 0x00038288
[    0.070449][    T1] gs-acpm-ipc 17610000.mailbox: exynos_mbox_chan_init ID = 2 offsets: tx_base = 0x000382f0 tx_front = 0x000382ec, tx_rear = 0x000382e8


tx_base contains the SRAM offset of the RX queue used by Linux. The
offset is relative to the base address of the SRAM config data.

tx_base is named from the firmware's point of view, hence the TX name.
I assume the same struct is defined in the ACPM firmware.


Somewhere below in the linux driver I get the RX ring base address by doing:

rx->base = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_base);

where base is the SRAM base address of the channels configuration data.

static void __iomem *exynos_acpm_get_iomem_addr(void __iomem *base,
						void __iomem *addr)
{
	u32 offset;

	offset = readl_relaxed(addr);
	return base + offset;
}

Hope this clarifies these struct members a bit.
Cheers,
ta



* Re: [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-21 16:32         ` Jassi Brar
@ 2024-10-22 13:26           ` Tudor Ambarus
  2024-10-24  1:27             ` Jassi Brar
  0 siblings, 1 reply; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-22 13:26 UTC (permalink / raw)
  To: Jassi Brar
  Cc: krzk, alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

Hi, Jassi,

On 10/21/24 5:32 PM, Jassi Brar wrote:
>> On 10/18/24 8:49 AM, Tudor Ambarus wrote:
> 
>> The active request is considered completed when TX completes. But it seems
>> that TX is not in direct relation with RX,
> 
> TX and RX are assumed independent operations (which they are).
> TX is sending a message/request to the remote successfully. 'Success'
> can be indicated by any of the three methods  TXDONE_BY_{POLLING, IRQ,
> ACK}.
> You seem to assume it should always be an ACK where we receive an
> acknowledgment/response packet, which is not the case.

My controller driver indeed ties TX to RX and considers the request
completed when the RX completes.

But other drivers can still complete the request at TXDONE if that's
what they need.

> 
> ...
> 
>>> Correct, and it is not meant to be.
>>> You are assuming there is always an RX in response to a TX, which is
> 
>> Not really. If there's no response expected, clients can set req->rx to NULL.
> 
> You are assuming the default behaviour is that there is a reply
> associated with a TX, otherwise "clients can set req->rx to NULL".
> This api can be _used_ only for protocols that expect a response for
> each TX. For other protocols,  it is simply a passthrough wrapper (by
> doing things like req->rx = NULL). For handling this special case of
> Tx->Rx, maybe a higher level common helper function would be better.
> 
> ...
> 
>>> reasons, it is left to the user to tie an incoming RX to some previous
>>> TX, or not.
> 
>> Is there a specific driver that I can look at in order to understand the
>> case where RX is not tied to TX?
> 
>  Many...
>  * The remote end owns sensors and can asynchronously send, say
> thermal, notifications to Linux.
>  * On some platform the RX may be asynchronous, that is sent later
> with some identifying tag of the past TX.
>  * Just reverse the roles of local and remote - remote sends us a
> request (RX for us) and we are supposed to send a response (TX).

I was hoping for a name of a driver, but I guess I can find them out
eventually.
> 
>> Also, if you think there's a better way to enable controllers to manage
>> their hardware queues, please say.
>>
> Tying RX to TX has nothing to do with hardware queues. There can be a

Right, I agree.

> hardware queue and the protocol can still be
> "local-to-remote-broadcast".
> 
> While I don't think we need the "Rx is in relation to some past Tx"
> api, I am open to ideas to better utilize h/w queues.
> 
> The h/w TX queue/fifo may hold, say, 8 messages while the controller
> transmits them to the remote. Currently it is implemented by
> .last_tx_done() returning true as long as fifo is not full.
> The other option, to have more than one TX in-flight, only complicates
> the mailbox api without fetching any real benefits because the
> protocol would still need to handle cases like Tx was successful but
> the remote rejected the request or the Rx failed on the h/w fifo
> (there could be rx-fifo too, right). Is where I am at right now.
> 
No worries, I'm confident we'll reach a conclusion.

It's true I implemented just my use case, but that doesn't mean that the
"request" approach can't be extended for current users.

If we replace `void *message` with `struct mbox_request *req` throughout
the mailbox core, all the users can still do their magic, can't they?
The only difference would be that instead of having a completion
structure per channel, we'd have a completion structure per request.

In order to have more than one TX in-flight, we need each TX to have
its own completion struct. Then, for my case, if the clients expect a
response for each TX, it shall be the responsibility of the client to
allocate space for RX. This is exactly what struct mbox_request does:

+struct mbox_wait {
+	struct completion completion;
+	int err;
+};
+
+#define DECLARE_MBOX_WAIT(_wait) \
+	struct mbox_wait _wait = { \
+		COMPLETION_INITIALIZER_ONSTACK((_wait).completion), 0 }
+
+#define MBOX_REQ_MAY_SLEEP	BIT(0)
+
+struct mbox_request {
+	struct mbox_wait *wait;
+	void *tx;
+	void *rx;
+	unsigned int txlen;
+	unsigned int rxlen;
+	u32 flags;
+};
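
For illustration, a client would then build a request and wait on its
own completion, roughly like this (the send entry point name below is
hypothetical, just to sketch the flow; chan, tx_cmd and rx_buf are
assumed to exist):

	DECLARE_MBOX_WAIT(wait);
	struct mbox_request req = {
		.wait	= &wait,
		.tx	= tx_cmd,
		.txlen	= sizeof(tx_cmd),
		.rx	= rx_buf,	/* or NULL if no response data is wanted */
		.rxlen	= sizeof(rx_buf),
		.flags	= MBOX_REQ_MAY_SLEEP,
	};
	int ret;

	ret = mbox_send_request(chan, &req);	/* hypothetical entry point */
	if (!ret) {
		wait_for_completion(&wait.completion);
		ret = wait.err;
	}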

Also, in order to have more than one TX in-flight, we need to allow
controller drivers to bypass the mailbox core's
one-active-request-at-a-time handling. The current software queue
handling mechanism from the mailbox core can be used as a backlog
mechanism: if the controller driver has no space to process a new
request (regardless of whether it has a hardware queue or not), it can
fall back to the backlog mechanism where one request is handled at a
time. The use of the backlog mechanism shall be an opt-in choice.

Now clients that don't care about RX are allowed to not allocate space
for it, and consider the request completed at TX done, if that's what
they need.

What am I missing?

Thanks,
ta



* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-22  7:58         ` Tudor Ambarus
@ 2024-10-23  9:00           ` Krzysztof Kozlowski
  2024-10-23  9:53             ` Tudor Ambarus
  0 siblings, 1 reply; 25+ messages in thread
From: Krzysztof Kozlowski @ 2024-10-23  9:00 UTC (permalink / raw)
  To: Tudor Ambarus, jassisinghbrar
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

On 22/10/2024 09:58, Tudor Ambarus wrote:
> 
>>>>
>>>> I also cannot find any piece of code setting several of above, e.g. tx_base
>>>
>>> I'm not writing any SRAM configuration fields, these fields are used to
>>> read/retrive the channel parameters from SRAM.
>>
>> I meany tx_base is always 0. Where is this property set? Ever?
> 
> It's not zero. My assumption is it is set in the acpm firmware, but I

Where is any assignment to this member?


> don't have access to that to verify. Here are some debug prints made in
> the linux driver:
> 
> [    0.069575][    T1] gs-acpm-ipc 17610000.mailbox:
> exynos_mbox_chan_init ID = 2 poll = 1, mlen = 16, qlen = 5
> [    0.069927][    T1] gs-acpm-ipc 17610000.mailbox:
> exynos_mbox_chan_init ID = 2 offsets: rx_base = 0x00038290 rx_front =
> 0x0003828c, rx_rear = 0x00038288
> [    0.070449][    T1] gs-acpm-ipc 17610000.mailbox:
> exynos_mbox_chan_init ID = 2 offsets: tx_base = 0x000382f0 tx_front =
> 0x000382ec, tx_rear = 0x000382e8
> 
> 
> tx_base contains the SRAM offset of the RX queue used in linux. The
> offset is relative to the base address of the SRAM config data.
> 
> tx_base is seen/named from the firmware's point of view, thus named TX.
> I assume the same struct is defined in the acpm firmware.
> 
> 
> Somewhere below in the linux driver I get the RX ring base address by doing:
> 
> rx->base = exynos_acpm_get_iomem_addr(base, &shmem_chan->tx_base);

tx_base is still 0.

> 
> where base is the SRAM base address of the channels configuration data.
> 
> static void __iomem *exynos_acpm_get_iomem_addr(void __iomem *base,
> 						void __iomem *addr)
> {
> 	u32 offset;
> 
> 	offset = readl_relaxed(addr);
> 	return base + offset;
> }
> 
> Hope this clarifies a bit these struct members.

No, where is tx_base assigned?

Best regards,
Krzysztof




* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-23  9:00           ` Krzysztof Kozlowski
@ 2024-10-23  9:53             ` Tudor Ambarus
  2024-10-23 10:02               ` Tudor Ambarus
  2024-10-24  9:36               ` Krzysztof Kozlowski
  0 siblings, 2 replies; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-23  9:53 UTC (permalink / raw)
  To: Krzysztof Kozlowski, jassisinghbrar
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano



On 10/23/24 10:00 AM, Krzysztof Kozlowski wrote:
>>>>> I also cannot find any piece of code setting several of above, e.g. tx_base
>>>> I'm not writing any SRAM configuration fields, these fields are used to
>>>> read/retrive the channel parameters from SRAM.
>>> I meany tx_base is always 0. Where is this property set? Ever?
>> It's not zero. My assumption is it is set in the acpm firmware, but I
> Where is any assignment to this member?

In probe() you'll see that exynos_acpm->shmem is a pointer into SRAM,
of type struct exynos_acpm_shmem __iomem *.

Then in:

static int exynos_acpm_chans_init()
{
	struct exynos_acpm_shmem_chan __iomem *shmem_chans, *shmem_chan;
	struct exynos_acpm_shmem __iomem *shmem = exynos_acpm->shmem;
	...

	shmem_chans = exynos_acpm_get_iomem_addr(exynos_acpm->sram_base,
						 &shmem->chans);
	...
}

Neither shmem->chans nor tx_base is initialized by this driver. I'm
using its address in SRAM (&shmem->chans), which I then read with
readl_relaxed().

I guess one can do the same using offsetof:
shmem_chans = readl_relaxed(shmem + offsetof(struct exynos_acpm_shmem,
					      chans));

Thanks,
ta



* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-23  9:53             ` Tudor Ambarus
@ 2024-10-23 10:02               ` Tudor Ambarus
  2024-10-24  9:36               ` Krzysztof Kozlowski
  1 sibling, 0 replies; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-23 10:02 UTC (permalink / raw)
  To: Krzysztof Kozlowski, jassisinghbrar
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano



On 10/23/24 10:53 AM, Tudor Ambarus wrote:
> 
> 
> On 10/23/24 10:00 AM, Krzysztof Kozlowski wrote:
>>>>>> I also cannot find any piece of code setting several of above, e.g. tx_base
>>>>> I'm not writing any SRAM configuration fields, these fields are used to
>>>>> read/retrive the channel parameters from SRAM.
>>>> I meany tx_base is always 0. Where is this property set? Ever?
>>> It's not zero. My assumption is it is set in the acpm firmware, but I
>> Where is any assignment to this member?
> 
> In probe() you'll see that exynos_acpm->shmem is a pointer in SRAM to a
> struct exynos_acpm_shmem __iomem *shmem;
> 
> Then in:
> 
> static int exynos_acpm_chans_init()
> {
> 	struct exynos_acpm_shmem_chan __iomem *shmem_chans, *shmem_chan;
> 	struct exynos_acpm_shmem __iomem *shmem = exynos_acpm->shmem;
> 	...
> 
> 	shmem_chans = exynos_acpm_get_iomem_addr(exynos_acpm->sram_base,
> 						 &shmem->chans);
> 	...
> }
> 
> shmem->chans is not initialized (or tx_base). I'm using its address in
> SRAM (&shmem->chans) which I then read it with readl_relaxed().
> 
> I guess one can do the same using offsetof:
> shmem_chans = readl_realaxed(shmem + offsetof(struct exynos_acpm_shmem,
> 					      chans));
> 

I forgot to add the sram_base; the counter-example should have been:

shmem_chans = exynos_acpm->sram_base +
	      readl_relaxed(shmem + offsetof(struct exynos_acpm_shmem,
					      chans));



* Re: [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-22 13:26           ` Tudor Ambarus
@ 2024-10-24  1:27             ` Jassi Brar
  2024-10-24 10:45               ` Tudor Ambarus
  0 siblings, 1 reply; 25+ messages in thread
From: Jassi Brar @ 2024-10-24  1:27 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: krzk, alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

Hi Tudor,

On Tue, Oct 22, 2024 at 8:27 AM Tudor Ambarus <tudor.ambarus@linaro.org> wrote:
>
> Hi, Jassi,
>
> On 10/21/24 5:32 PM, Jassi Brar wrote:
> >> On 10/18/24 8:49 AM, Tudor Ambarus wrote:
> >
> >> The active request is considered completed when TX completes. But it seems
> >> that TX is not in direct relation with RX,
> >
> > TX and RX are assumed independent operations (which they are).
> > TX is sending a message/request to the remote successfully. 'Success'
> > can be indicated by any of the three methods  TXDONE_BY_{POLLING, IRQ,
> > ACK}.
> > You seem to assume it should always be an ACK where we receive an
> > acknowledgment/response packet, which is not the case.
>
> My controller driver indeed ties TX to RX and considers the request
> completed when the RX completes.
>
Does your controller require RX, or does the protocol? I suspect the latter.
Anyway, the former is also supported by TXDONE_BY_ACK already.

> >
> >> Is there a specific driver that I can look at in order to understand the
> >> case where RX is not tied to TX?
> >
> >  Many...
> >  * The remote end owns sensors and can asynchronously send, say
> > thermal, notifications to Linux.
> >  * On some platform the RX may be asynchronous, that is sent later
> > with some identifying tag of the past TX.
> >  * Just reverse the roles of local and remote - remote sends us a
> > request (RX for us) and we are supposed to send a response (TX).
>
> I was hoping for a name of a driver, but I guess I can find them out
> eventually.
>
Do these usecases seem impossible to you? Many users are not upstream,
and we care about them as long as we are not making some corrective
change.


> >
> >> Also, if you think there's a better way to enable controllers to manage
> >> their hardware queues, please say.
> >>
> > Tying RX to TX has nothing to do with hardware queues. There can be a
>
> Right, I agree.
>
> > hardware queue and the protocol can still be
> > "local-to-remote-broadcast".
> >
> > While I don't think we need the "Rx is in relation to some past Tx"
> > api, I am open to ideas to better utilize h/w queues.
> >
> > The h/w TX queue/fifo may hold, say, 8 messages while the controller
> > transmits them to the remote. Currently it is implemented by
> > .last_tx_done() returning true as long as fifo is not full.
> > The other option, to have more than one TX in-flight, only complicates
> > the mailbox api without fetching any real benefits because the
> > protocol would still need to handle cases like Tx was successful but
> > the remote rejected the request or the Rx failed on the h/w fifo
> > (there could be rx-fifo too, right). Is where I am at right now.
> >
> No worries, I'm confident we'll reach a conclusion.
>
> It's true I implemented just my use case, but that doesn't mean that the
> "request" approach can't be extended for current users.
>
Sorry, I am not sure we should make the api dance around your usecase.
If your usecase is not currently handled, please let me know. We can
discuss that.

Thanks.



* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-23  9:53             ` Tudor Ambarus
  2024-10-23 10:02               ` Tudor Ambarus
@ 2024-10-24  9:36               ` Krzysztof Kozlowski
  2024-10-25 10:00                 ` Tudor Ambarus
  1 sibling, 1 reply; 25+ messages in thread
From: Krzysztof Kozlowski @ 2024-10-24  9:36 UTC (permalink / raw)
  To: Tudor Ambarus, jassisinghbrar
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

On 23/10/2024 11:53, Tudor Ambarus wrote:
> 
> 
> On 10/23/24 10:00 AM, Krzysztof Kozlowski wrote:
>>>>>> I also cannot find any piece of code setting several of above, e.g. tx_base
>>>>> I'm not writing any SRAM configuration fields, these fields are used to
>>>>> read/retrive the channel parameters from SRAM.
>>>> I meany tx_base is always 0. Where is this property set? Ever?
>>> It's not zero. My assumption is it is set in the acpm firmware, but I
>> Where is any assignment to this member?
> 
> In probe() you'll see that exynos_acpm->shmem is a pointer in SRAM to a
> struct exynos_acpm_shmem __iomem *shmem;
> 
> Then in:
> 
> static int exynos_acpm_chans_init()
> {
> 	struct exynos_acpm_shmem_chan __iomem *shmem_chans, *shmem_chan;
> 	struct exynos_acpm_shmem __iomem *shmem = exynos_acpm->shmem;
> 	...
> 
> 	shmem_chans = exynos_acpm_get_iomem_addr(exynos_acpm->sram_base,
> 						 &shmem->chans);
> 	...
> }
> 
> shmem->chans is not initialized (or tx_base). I'm using its address in
> SRAM (&shmem->chans) which I then read it with readl_relaxed().
> 
> I guess one can do the same using offsetof:
> shmem_chans = readl_realaxed(shmem + offsetof(struct exynos_acpm_shmem,
> 					      chans));
> 

I see, the code and the naming is confusing. Two exynos_acpm_shmem_chan
variables and one exynos_acpm_shmem. shmem_chans is used as an array,
but nowhere pointed or indicated that it is array of some size.

All this could be clearer if exynos_acpm_shmem_chan was packed, because
then it is obvious it points to defined memory, but maybe packed is not
correct here? Probably splitting all this into logical chunks would be
useful. Like not mixing reading offsets with reading values, because I
really have to spend a lot of time to identify which one is which in
exynos_acpm_chans_init().

Best regards,
Krzysztof




* Re: [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-24  1:27             ` Jassi Brar
@ 2024-10-24 10:45               ` Tudor Ambarus
  2024-10-29 15:59                 ` Jassi Brar
  0 siblings, 1 reply; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-24 10:45 UTC (permalink / raw)
  To: Jassi Brar
  Cc: krzk, alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano



On 10/24/24 2:27 AM, Jassi Brar wrote:
> Hi Tudor,
> 

Hi, Jassi!

I appreciate that you respond quickly to my emails, thanks!

> On Tue, Oct 22, 2024 at 8:27 AM Tudor Ambarus <tudor.ambarus@linaro.org> wrote:
>>
>> Hi, Jassi,
>>
>> On 10/21/24 5:32 PM, Jassi Brar wrote:
>>>> On 10/18/24 8:49 AM, Tudor Ambarus wrote:
>>>
>>>> The active request is considered completed when TX completes. But it seems
>>>> that TX is not in direct relation with RX,
>>>
>>> TX and RX are assumed independent operations (which they are).
>>> TX is sending a message/request to the remote successfully. 'Success'
>>> can be indicated by any of the three methods  TXDONE_BY_{POLLING, IRQ,
>>> ACK}.
>>> You seem to assume it should always be an ACK where we receive an
>>> acknowledgment/response packet, which is not the case.
>>
>> My controller driver indeed ties TX to RX and considers the request
>> completed when the RX completes.
>>
> Does your controller require RX or the protocol? I suspect the latter.

The response from the remote always happens for the ACPM protocol. Same
for the plain (no ACPM or SRAM involved) mailbox controller that I see
downstream.

While the response from the remote always happens, the RX data is optional.
Clients can choose whether they want the data from the RX ring or not
(see `struct exynos_acpm_rx_data`).

For each message that is sent on the TX ring, it is expected that the
remote consumes it. The remote gets the message from the TX queue,
advances the rear index of the TX ring, processes the request, writes
the response message (if any) on the linux RX ring and advances the
front index of the linux RX ring.

> Anyway, the former is also supported by TXDONE_BY_ACK already.

If we want to focus on when TX is done and really be strict about it,
then for my case TX can be considered done when the remote consumes it.
I need to poll and check when the Linux TX ring rear index is moved by
the remote. Another option is to poll the IRQ status register of the
remote to see when the request was consumed. So maybe TXDONE_BY_POLL is
more accurate?
TX can also be considered done once what we write to the TX ring hits
the endpoint SRAM, in which case neither of these flags is needed.
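
(Roughly, a last_tx_done() for the "remote consumed it" definition could
look like the sketch below; the struct and the tx.rear field are assumed
from the snippets quoted earlier, so treat it as illustrative only:)

static bool exynos_acpm_last_tx_done(struct mbox_chan *mbox_chan)
{
	struct exynos_acpm_chan *chan = mbox_chan->con_priv;

	/* TX is "done" once the remote consumed every queued slot, i.e. the
	 * rear index (advanced by the remote) caught up with the front index
	 * (advanced by us). */
	return readl_relaxed(chan->tx.rear) == readl_relaxed(chan->tx.front);
}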

As a side note on IRQs, the ACPM protocol allows channels to work either
in polling or in IRQ mode. I expect in the future we'll need to specify
the done method per channel and not per controller. IRQs are not used
downstream, thus I didn't touch them in this proposal.

> 
>>>
>>>> Is there a specific driver that I can look at in order to understand the
>>>> case where RX is not tied to TX?
>>>
>>>  Many...
>>>  * The remote end owns sensors and can asynchronously send, say
>>> thermal, notifications to Linux.
>>>  * On some platform the RX may be asynchronous, that is sent later
>>> with some identifying tag of the past TX.
>>>  * Just reverse the roles of local and remote - remote sends us a
>>> request (RX for us) and we are supposed to send a response (TX).
>>
>> I was hoping for a name of a driver, but I guess I can find them out
>> eventually.
>>
> Do these usecases seem impossible to you? Many users are not upstream

They seem fine, I just wanted to see the implementation and decide
whether the request approach can be applied to them or not. I think it can.

> that we care
> about as long as we are not making some corrective change.>
> 
>>>
>>>> Also, if you think there's a better way to enable controllers to manage
>>>> their hardware queues, please say.
>>>>
>>> Tying RX to TX has nothing to do with hardware queues. There can be a
>>
>> Right, I agree.
>>
>>> hardware queue and the protocol can still be
>>> "local-to-remote-broadcast".
>>>
>>> While I don't think we need the "Rx is in relation to some past Tx"
>>> api, I am open to ideas to better utilize h/w queues.
>>>
>>> The h/w TX queue/fifo may hold, say, 8 messages while the controller
>>> transmits them to the remote. Currently it is implemented by
>>> .last_tx_done() returning true as long as fifo is not full.
>>> The other option, to have more than one TX in-flight, only complicates
>>> the mailbox api without fetching any real benefits because the
>>> protocol would still need to handle cases like Tx was successful but
>>> the remote rejected the request or the Rx failed on the h/w fifo
>>> (there could be rx-fifo too, right). Is where I am at right now.
>>>
>> No worries, I'm confident we'll reach a conclusion.
>>
>> It's true I implemented just my use case, but that doesn't mean that the
>> "request" approach can't be extended for current users.
>>
> Sorry, I am not sure we should make the api dance around your usecase.

No worries, it's fine to disagree.

> If your usecase is not currently handled, please let me know. We can
> discuss that.

It's not handled. I have a list of requirements I have to fulfill which
are not covered by the mailbox core:

1/ allow multiple TX in-flight. We need to let the controller handle its
hardware queue, otherwise the hardware queue makes no sense at all.
2/ allow tying a TX to an RX. I need to know which TX the response
corresponds to.
3/ allow multiple clients on the same channel. ACPM allows this. Support
would have come as an extra patch.
4/ allow polling and IRQ channels for the same mailbox controller (not
urgent).

I guess that we agree that in order to allow multiple TX in-flight we
need a completion struct per message/request and not per channel. But we
disagree on the ability to tie a TX to a RX.

How can I move forward with this? No better options come to mind right
now. Shall I take the Apple approach and move everything to drivers/soc?
If any of you has another idea, shoot.

Thanks,
ta



* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-21 16:47   ` Alim Akhtar
@ 2024-10-25  9:44     ` Tudor Ambarus
  0 siblings, 0 replies; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-25  9:44 UTC (permalink / raw)
  To: Alim Akhtar, krzk
  Cc: mst, javierm, tzimmermann, bartosz.golaszewski, luzmaximilian,
	sudeep.holla, conor.dooley, bjorn, ulf.hansson, linux-samsung-soc,
	linux-kernel, linux-arm-kernel, marcan, neal, alyssa, broonie,
	andre.draszik, willmcvicker, peter.griffin, kernel-team,
	vincent.guittot, daniel.lezcano, Jassi Brar



On 10/21/24 5:47 PM, Alim Akhtar wrote:
> Hi Tudor

Hi, Alim!

Thanks for the review!

>> diff --git a/drivers/firmware/samsung/Kconfig
>> b/drivers/firmware/samsung/Kconfig
>> new file mode 100644
>> index 000000000000..f908773c1441
>> --- /dev/null
>> +++ b/drivers/firmware/samsung/Kconfig
>> @@ -0,0 +1,11 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +
>> +config EXYNOS_ACPM
> 
> This looks misleading to me, as you mentioned below, ACPM is a FW which runs
> on APM module, and 
> The proposed driver is a communication method between Application processor
> and APM module,
> Which is via MAILBOX.
> So preferably EXYNOS_MAILBOX_APM is more meaningful here.

This seems more accurate indeed. The design document that I have refers
to the protocol as "ACPM IPC", so maybe I shall stick with EXYNOS_ACPM_IPC.

I'll also need to prefix all the definitions in the driver with
exynos_acpm_ipc_* which will result in long names. But I guess there's
nothing we can do about it.

Cheers,
ta



* Re: [PATCH v2 2/2] firmware: add exynos acpm driver
  2024-10-24  9:36               ` Krzysztof Kozlowski
@ 2024-10-25 10:00                 ` Tudor Ambarus
  0 siblings, 0 replies; 25+ messages in thread
From: Tudor Ambarus @ 2024-10-25 10:00 UTC (permalink / raw)
  To: Krzysztof Kozlowski, jassisinghbrar
  Cc: alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano



On 10/24/24 10:36 AM, Krzysztof Kozlowski wrote:
> On 23/10/2024 11:53, Tudor Ambarus wrote:
>>
>>
>> On 10/23/24 10:00 AM, Krzysztof Kozlowski wrote:
>>>>>>> I also cannot find any piece of code setting several of above, e.g. tx_base
>>>>>> I'm not writing any SRAM configuration fields, these fields are used to
>>>>>> read/retrive the channel parameters from SRAM.
>>>>> I meany tx_base is always 0. Where is this property set? Ever?
>>>> It's not zero. My assumption is it is set in the acpm firmware, but I
>>> Where is any assignment to this member?
>>
>> In probe() you'll see that exynos_acpm->shmem is a pointer in SRAM to a
>> struct exynos_acpm_shmem __iomem *shmem;
>>
>> Then in:
>>
>> static int exynos_acpm_chans_init()
>> {
>> 	struct exynos_acpm_shmem_chan __iomem *shmem_chans, *shmem_chan;
>> 	struct exynos_acpm_shmem __iomem *shmem = exynos_acpm->shmem;
>> 	...
>>
>> 	shmem_chans = exynos_acpm_get_iomem_addr(exynos_acpm->sram_base,
>> 						 &shmem->chans);
>> 	...
>> }
>>
>> shmem->chans is not initialized (or tx_base). I'm using its address in
>> SRAM (&shmem->chans) which I then read it with readl_relaxed().
>>
>> I guess one can do the same using offsetof:
>> shmem_chans = readl_realaxed(shmem + offsetof(struct exynos_acpm_shmem,
>> 					      chans));
>>
> 
> I see, the code and the naming is confusing. Two exynos_acpm_shmem_chan

Noted. I'll refactor exynos_acpm_chans_init() in the next version.

> variables and one exynos_acpm_shmem. shmem_chans is used as an array,
> but nowhere pointed or indicated that it is array of some size.
>

I understand, will update. I added documentation for `struct
exynos_acpm_shmem` describing the array of chans and the number of
chans, but I'll figure out something more to make it clearer.

> All this could be clearer if exynos_acpm_shmem_chan was packed, because
> then it is obvious it points to defined memory, but maybe packed is not

__packed shall be alright, but it's not needed because all the members
of the struct are u32 and the address of the struct is u64 aligned.
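
(If it helps to document the expected SRAM layout, a compile-time check
could be added along these lines; the offsets are computed from the
struct as posted, so this is only a sketch:)

	static_assert(sizeof(struct exynos_acpm_shmem_chan) == 18 * sizeof(u32));
	static_assert(offsetof(struct exynos_acpm_shmem_chan, tx_base) == 48);
	static_assert(offsetof(struct exynos_acpm_shmem_chan, polling) == 68);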

> correct here? Probably splitting all this into logical chunks would be
> useful. Like not mixing reading offsets with reading values, because I
> really have to spend a lot of time to identify which one is which in
> exynos_acpm_chans_init().
> 

I understand, will update. Need to figure out what other options we have
with the mailbox core changes first. Thanks for the suggestions!

Cheers,
ta



* Re: [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-24 10:45               ` Tudor Ambarus
@ 2024-10-29 15:59                 ` Jassi Brar
  2024-11-18 15:22                   ` Tudor Ambarus
  0 siblings, 1 reply; 25+ messages in thread
From: Jassi Brar @ 2024-10-29 15:59 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: krzk, alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano

On Thu, Oct 24, 2024 at 5:45 AM Tudor Ambarus <tudor.ambarus@linaro.org> wrote:
>
>
>
> On 10/24/24 2:27 AM, Jassi Brar wrote:
> > Hi Tudor,
> >
>
> Hi, Jassi!
>
> I appreciate that you respond quickly on my emails, thanks!
>
> > On Tue, Oct 22, 2024 at 8:27 AM Tudor Ambarus <tudor.ambarus@linaro.org> wrote:
> >>
> >> Hi, Jassi,
> >>
> >> On 10/21/24 5:32 PM, Jassi Brar wrote:
> >>>> On 10/18/24 8:49 AM, Tudor Ambarus wrote:
> >>>
> >>>> The active request is considered completed when TX completes. But it seems
> >>>> that TX is not in direct relation with RX,
> >>>
> >>> TX and RX are assumed independent operations (which they are).
> >>> TX is sending a message/request to the remote successfully. 'Success'
> >>> can be indicated by any of the three methods  TXDONE_BY_{POLLING, IRQ,
> >>> ACK}.
> >>> You seem to assume it should always be an ACK where we receive an
> >>> acknowledgment/response packet, which is not the case.
> >>
> >> My controller driver indeed ties TX to RX and considers the request
> >> completed when the RX completes.
> >>
> > Does your controller require RX or the protocol? I suspect the latter.
>
> The response from remote always happens for the acpm protocol. Same for
> the plain (no acpm or SRAM involved) mailbox controller that I see in
> downstream.
>
> While the response from remote always happens, the RX data is optional.
> Clients can choose whether they want the data from the RX ring or not
> (see `struct exynos_acpm_rx_data`).
>
> For each message that is sent on the TX ring, it is expected that the
> remote consumes it. The remote gets the message from the TX queue,
> advances the rear index of the TX ring, processes the request, writes
> the response message (if any) on the linux RX ring and advances the
> front index of the linux RX ring.
>
> > Anyway, the former is also supported by TXDONE_BY_ACK already.
>
> If we want to focus on when TX is done and really be strict about it,
> then for my case TX can be considered done when the remote consumes it.
> I need to poll and check when the linux TX ring rear index is moved by
> the remote. Other option is to poll the IRQ status register of the
> remote to see when the request was consumed. So maybe TXDONE_BY_POLL is
> more accurate?
> TX can also be considered done after what we write to TX ring hits the
> endpoint-SRAM, thus neither of these flags needed.
>
> As a side nodeto IRQs, the acpm protocol allows channels to work either
> in polling or in IRQ mode. I expect in the future we'll need to specify
> the done method per channel and not per controller. IRQs are not used in
> the downstream, thus I didn't touch them in this proposal.
>
> >
> >>>
> >>>> Is there a specific driver that I can look at in order to understand the
> >>>> case where RX is not tied to TX?
> >>>
> >>>  Many...
> >>>  * The remote end owns sensors and can asynchronously send, say
> >>> thermal, notifications to Linux.
> >>>  * On some platform the RX may be asynchronous, that is sent later
> >>> with some identifying tag of the past TX.
> >>>  * Just reverse the roles of local and remote - remote sends us a
> >>> request (RX for us) and we are supposed to send a response (TX).
> >>
> >> I was hoping for a name of a driver, but I guess I can find them out
> >> eventually.
> >>
> > Do these usecases seem impossible to you? Many users are not upstream
>
> They seem fine, I just wanted to see the implementation and decide
> whether the request approach can be applied to them or not. I think it can.
>
> > that we care
> > about as long as we are not making some corrective change.
> >
> >>>
> >>>> Also, if you think there's a better way to enable controllers to manage
> >>>> their hardware queues, please say.
> >>>>
> >>> Tying RX to TX has nothing to do with hardware queues. There can be a
> >>
> >> Right, I agree.
> >>
> >>> hardware queue and the protocol can still be
> >>> "local-to-remote-broadcast".
> >>>
> >>> While I don't think we need the "Rx is in relation to some past Tx"
> >>> api, I am open to ideas to better utilize h/w queues.
> >>>
> >>> The h/w TX queue/fifo may hold, say, 8 messages while the controller
> >>> transmits them to the remote. Currently it is implemented by
> >>> .last_tx_done() returning true as long as fifo is not full.
> >>> The other option, to have more than one TX in-flight, only complicates
> >>> the mailbox api without bringing any real benefits, because the
> >>> protocol would still need to handle cases like Tx was successful but
> >>> the remote rejected the request or the Rx failed on the h/w fifo
> >>> (there could be an rx-fifo too, right). That is where I am at right now.
> >>>
> >> No worries, I'm confident we'll reach a conclusion.
> >>
> >> It's true I implemented just my use case, but that doesn't mean that the
> >> "request" approach can't be extended for current users.
> >>
> > Sorry, I am not sure we should make the api dance around your usecase.
>
> No worries, it's fine to disagree.
>
> > If your usecase is not currently handled, please let me know. We can
> > discuss that.
>
> It's not handled. I have a list of requirements I have to fulfill which
> are not covered by the mailbox core:
>
> 1/ allow multiple TX in-flight. We need to let the controller handle its
> hardware queue, otherwise the hardware queue makes no sense at all.
>
As I said this one is currently handled by assuming TX-done by
depositing in the h/w queue/fifo.
You will have the same perf as with your attempt to have "multiple
in-flight" while neither of these approaches handles in-flight
failures. We can discuss this.
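
Roughly, the controller side of that model is just the below (the names
are made up, it is not any in-tree driver):

/* TXDONE_BY_POLL: a free slot in the h/w fifo means the last TX is done,
 * so the core keeps feeding messages as long as the fifo can take more.
 */
static bool foo_mbox_last_tx_done(struct mbox_chan *chan)
{
	struct foo_mbox *mbox = chan->con_priv;

	return !foo_mbox_fifo_full(mbox);
}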

> 2/ allow tying a TX to an RX. I need to know which TX the response
> corresponds to.
> 3/ allow multiple clients on the same channel. ACPM allows this. Support
> would have come as an extra patch.
>
These are nothing new. A few other platforms too have shared channels
and that is implemented above the mailbox.

> 4/ allow polling and IRQ channels for the same mailbox controller (not
> urgent).
>
It is very easy to populate them as separate controllers.

> I guess that we agree that in order to allow multiple TX in-flight we
> need a completion struct per message/request and not per channel. But we
> disagree on the ability to tie a TX to a RX.
>
> How can I move forward with this?
>
As I already explained, tying RX to a TX is very protocol-specific
and should be done in the layer above the mailbox API.

Thanks.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues
  2024-10-29 15:59                 ` Jassi Brar
@ 2024-11-18 15:22                   ` Tudor Ambarus
  0 siblings, 0 replies; 25+ messages in thread
From: Tudor Ambarus @ 2024-11-18 15:22 UTC (permalink / raw)
  To: Jassi Brar
  Cc: krzk, alim.akhtar, mst, javierm, tzimmermann, bartosz.golaszewski,
	luzmaximilian, sudeep.holla, conor.dooley, bjorn, ulf.hansson,
	linux-samsung-soc, linux-kernel, linux-arm-kernel, marcan, neal,
	alyssa, broonie, andre.draszik, willmcvicker, peter.griffin,
	kernel-team, vincent.guittot, daniel.lezcano, Arnd Bergmann

Hi, Jassi,

Sorry for the late reply, I was off for a while.

On 10/29/24 3:59 PM, Jassi Brar wrote:
>>> If your usecase is not currently handled, please let me know. We can
>>> discuss that.
>> It's not handled. I have a list of requirements I have to fulfill which
>> are not covered by the mailbox core:
>>
>> 1/ allow multiple TX in-flight. We need to let the controller handle its
>> hardware queue, otherwise the hardware queue makes no sense at all.
>>
> As I said this one is currently handled by assuming TX-done by
> depositing in the h/w queue/fifo.

This may indeed work. I would have a TXDONE_BY_ACK controller. Its
`.send_data` would be a one-liner, where I just raise the interrupt to
the firmware. The TX ring would be written by `cl->tx_prepare()`.

Then in the protocol driver I would do:
ret = mbox_send_message(mbox_chan, msg);
if (ret < 0)
	return ret;

/* TX-done when depositing in the h/w queue. */
mbox_client_txdone(mbox_chan, 0);

ret = exynos_acpm_wait_for_message_response(mbox_chan, msg);
if (ret)
	return ret;
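
And the controller's .send_data would be little more than the doorbell
write, roughly like the below (register and bit names are made up):

/* the client already filled the TX ring in its tx_prepare(), so just
 * kick the remote
 */
static int exynos_acpm_mbox_send_data(struct mbox_chan *chan, void *data)
{
	struct exynos_acpm_mbox *acpm = chan->con_priv;

	writel(ACPM_MBOX_DOORBELL, acpm->regs + ACPM_MBOX_INTGR);

	return 0;
}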

> You will have the same perf as with your attempt to have "multiple

I'm still forced to pass all the messages to the mailbox's core software
queue. I also don't need the locking from the core. For my case the mbox
core just needs to do:

if (chan->cl->tx_prepare)
	chan->cl->tx_prepare(chan->cl, data);

return chan->mbox->ops->send_data(chan, data);

Would it make sense to introduce such a method?
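
Something along these lines is what I have in mind (the name is just a
placeholder):

/* hypothetical fast path: no software queue, no per-channel locking */
int mbox_send_message_direct(struct mbox_chan *chan, void *data)
{
	if (!chan || !chan->cl)
		return -EINVAL;

	if (chan->cl->tx_prepare)
		chan->cl->tx_prepare(chan->cl, data);

	return chan->mbox->ops->send_data(chan, data);
}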

BTW, what are the minimum requirements for a controller to be considered
a mailbox controller? As you saw, I'm using the mailbox controller just
to raise the interrupt to the firmware; no messages are written to the
controller's buffers.

Does message size matter? On my device the ACPM protocol uses messages
of 0, 2, 16 or 32 bytes, but it also allows passing a pointer to an
indirection command SRAM buffer, where up to 412 bytes can be written.
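
Purely for illustration (this is not the actual layout from
exynos-acpm-message.h), a transfer descriptor would be along the lines of:

/* fixed-size command/response data on the rings; anything bigger goes
 * through an indirection buffer in SRAM (up to 412 bytes on my device)
 */
struct acpm_xfer {
	u32 txd[8];	/* up to 32 bytes of command data */
	u32 rxd[8];	/* up to 32 bytes of response data */
	size_t txlen;	/* 0, 2, 16 or 32 bytes actually used */
	size_t rxlen;
};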

> in-flight" while neither of these approaches handles in-flight
> failures. We can discuss this.

Do you mean that there's no retry mechanism? The crypto subsystem has
one; we may look at what's done there if we care.

> 
>> 2/ allow tying a TX to an RX. I need to know which TX the response
>> corresponds to.
>> 3/ allow multiple clients on the same channel. ACPM allows this. Support
>> would have come as an extra patch.
>>
> These are nothing new. A few other platforms too have shared channels
> and that is implemented above the mailbox.
> 
okay

>> 4/ allow polling and IRQ channels for the same mailbox controller (not
>> urgent).
>>
> It is very easy to populate them as separate controllers.

Do you mean to call devm_mbox_controller_register() twice for the same dev?
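
i.e. roughly the below at probe time, assuming one mbox_controller for the
polling channels and one for the IRQ ones (ops and channel setup omitted,
and the exynos_acpm struct holding two controllers is just an assumption):

static int exynos_acpm_register_controllers(struct exynos_acpm *acpm,
					    struct device *dev)
{
	int ret;

	acpm->poll_mbox.dev = dev;
	acpm->poll_mbox.txdone_poll = true;
	ret = devm_mbox_controller_register(dev, &acpm->poll_mbox);
	if (ret)
		return ret;

	acpm->irq_mbox.dev = dev;
	acpm->irq_mbox.txdone_irq = true;
	return devm_mbox_controller_register(dev, &acpm->irq_mbox);
}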

Thanks,
ta


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2024-11-18 15:23 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-10-17 16:36 [PATCH v2 0/2] mailbox: add async request mechanism w/ a user Tudor Ambarus
2024-10-17 16:36 ` [PATCH v2 1/2] mailbox: add async request mechanism to empower controllers w/ hw queues Tudor Ambarus
2024-10-18  4:17   ` Jassi Brar
2024-10-18  7:49     ` Tudor Ambarus
2024-10-21  6:18       ` Tudor Ambarus
2024-10-21 16:32         ` Jassi Brar
2024-10-22 13:26           ` Tudor Ambarus
2024-10-24  1:27             ` Jassi Brar
2024-10-24 10:45               ` Tudor Ambarus
2024-10-29 15:59                 ` Jassi Brar
2024-11-18 15:22                   ` Tudor Ambarus
2024-10-17 16:36 ` [PATCH v2 2/2] firmware: add exynos acpm driver Tudor Ambarus
2024-10-21 11:52   ` Krzysztof Kozlowski
2024-10-21 14:12     ` Tudor Ambarus
2024-10-22  4:38       ` Krzysztof Kozlowski
2024-10-22  7:27         ` Vincent Guittot
2024-10-22  7:58         ` Tudor Ambarus
2024-10-23  9:00           ` Krzysztof Kozlowski
2024-10-23  9:53             ` Tudor Ambarus
2024-10-23 10:02               ` Tudor Ambarus
2024-10-24  9:36               ` Krzysztof Kozlowski
2024-10-25 10:00                 ` Tudor Ambarus
2024-10-21 14:52   ` Tudor Ambarus
2024-10-21 16:47   ` Alim Akhtar
2024-10-25  9:44     ` Tudor Ambarus

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).