linux-arm-msm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lina Iyer <ilina@codeaurora.org>
To: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: andy.gross@linaro.org, david.brown@linaro.org,
	linux-arm-msm@vger.kernel.org, linux-soc@vger.kernel.org,
	rnayak@codeaurora.org, linux-kernel@vger.kernel.org,
	sboyd@kernel.org, evgreen@chromium.org, dianders@chromium.org
Subject: Re: [PATCH v5 01/10] drivers: qcom: rpmh-rsc: add RPMH controller for QCOM SoCs
Date: Fri, 13 Apr 2018 10:16:52 -0600	[thread overview]
Message-ID: <20180413161652.GB1209@codeaurora.org> (raw)
In-Reply-To: <20180410234042.GC6727@builder>

On Tue, Apr 10 2018 at 17:40 -0600, Bjorn Andersson wrote:
>On Thu 05 Apr 09:18 PDT 2018, Lina Iyer wrote:
>[..]
>> diff --git a/drivers/soc/qcom/rpmh-internal.h b/drivers/soc/qcom/rpmh-internal.h
>[..]
>> +/**
>> + * struct tcs_response: Response object for a request
>> + *
>> + * @drv:   the controller
>> + * @msg:   the request for this response
>> + * @m:     the tcs identifier
>> + * @err:   error reported in the response
>> + * @list:  element in list of pending response objects
>> + */
>> +struct tcs_response {
>> +	struct rsc_drv *drv;
>> +	const struct tcs_request *msg;
>> +	u32 m;
>
>m is assigned in one place but never used.
>
Right. Remnant from the downstream driver that uses buffers of
responses.

>> +	int err;
>> +	struct list_head list;
>> +};
>[..]
>> diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
>[..]
>> +static struct tcs_group *get_tcs_from_index(struct rsc_drv *drv, int m)
>> +{
>> +	struct tcs_group *tcs;
>> +	int i;
>> +
>> +	for (i = 0; i < drv->num_tcs; i++) {
>> +		tcs = &drv->tcs[i];
>> +		if (tcs->mask & BIT(m))
>> +			return tcs;
>> +	}
>> +
>> +	WARN(i == drv->num_tcs, "Incorrect TCS index %d", m);
>> +
>> +	return NULL;
>> +}
>> +
>> +static struct tcs_response *setup_response(struct rsc_drv *drv,
>> +					   const struct tcs_request *msg, int m)
>> +{
>> +	struct tcs_response *resp;
>> +	struct tcs_group *tcs;
>> +
>> +	resp = kzalloc(sizeof(*resp), GFP_ATOMIC);
>
>I still don't like the idea that you allocate a response struct for each
>request, then upon getting an ack post this on a list and schedule a
>tasklet in order to optionally deliver the return value to the waiting
>caller.
>
>Why don't you just just add the "err" and a completion to the
>tcs_request struct and if it's a sync operation you complete that in
>your irq handler?
>
>That would remove the response struct, the list of them, the tasklet and
>the dynamic memory handling - at the "cost" of making the code possible
>to follow.
>
Hmm.. Ok. Will try to simplify.

>> +	if (!resp)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	resp->drv = drv;
>> +	resp->msg = msg;
>> +	resp->err = 0;
>> +
>> +	tcs = get_tcs_from_index(drv, m);
>> +	if (!tcs)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	assert_spin_locked(&tcs->lock);
>
>I tried to boot the kernel with the rpmh-clk and rpmh-regulator drivers
>and I kept hitting this assert.
>
>Turns out that find_free_tcs() finds an empty TCS with index 'm' within
>the tcs, then passes it to setup_response() which tries to use the 'm'
>to figure out which tcs contains the TCS we're operating on.
>
>But as 'm' is in tcs-local space and get_tcs_from_index() tries to
>lookup the TCS in the global drv space we get hold of the wrong TCS.
>
You are right. I will fix it. Thanks for point out. Wonder what is in
your DT, that caused this to be triggered. Clearly it's missing my
setup.

>> +	tcs->responses[m - tcs->offset] = resp;
>> +
>> +	return resp;
>> +}
>> +
>> +static void free_response(struct tcs_response *resp)
>> +{
>> +	kfree(resp);
>> +}
>> +
>> +static struct tcs_response *get_response(struct rsc_drv *drv, u32 m)
>> +{
>> +	struct tcs_group *tcs = get_tcs_from_index(drv, m);
>> +
>> +	return tcs->responses[m - tcs->offset];
>> +}
>> +
>> +static u32 read_tcs_reg(struct rsc_drv *drv, int reg, int m, int n)
>> +{
>> +	return readl_relaxed(drv->tcs_base + reg + RSC_DRV_TCS_OFFSET * m +
>> +			     RSC_DRV_CMD_OFFSET * n);
>> +}
>> +
>> +static void write_tcs_reg(struct rsc_drv *drv, int reg, int m, int n, u32 data)
>> +{
>> +	writel_relaxed(data, drv->tcs_base + reg + RSC_DRV_TCS_OFFSET * m +
>> +		       RSC_DRV_CMD_OFFSET * n);
>
>Do you really want this relaxed? Isn't the ordering of these
>significant?
>
The ordering isnt. I can make it not relaxed. Only ordering requirement
is that we tigger after writing everything.

>> +}
>> +
>> +static void write_tcs_reg_sync(struct rsc_drv *drv, int reg, int m, int n,
>> +			       u32 data)
>> +{
>> +	write_tcs_reg(drv, reg, m, n, data);
>> +	for (;;) {
>> +		if (data == read_tcs_reg(drv, reg, m, n))
>> +			break;
>> +		udelay(1);
>> +	}
>> +}
>> +
>> +static bool tcs_is_free(struct rsc_drv *drv, int m)
>> +{
>> +	return !test_bit(m, drv->tcs_in_use) &&
>> +	       read_tcs_reg(drv, RSC_DRV_STATUS, m, 0);
>> +}
>> +
>> +static struct tcs_group *get_tcs_of_type(struct rsc_drv *drv, int type)
>
>According to rpmh_rsc_probe() the tcs array is indexed by "type", so you
>can replace the entire function with:
>
>	return &drv->tcs[type];
>
Hmm. Ok.

>> +{
>> +	int i;
>> +	struct tcs_group *tcs;
>> +
>> +	for (i = 0; i < TCS_TYPE_NR; i++) {
>> +		if (type == drv->tcs[i].type)
>> +			break;
>> +	}
>> +
>> +	if (i == TCS_TYPE_NR)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	tcs = &drv->tcs[i];
>> +	if (!tcs->num_tcs)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	return tcs;
>> +}
>> +
>> +static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,
>> +					 const struct tcs_request *msg)
>> +{
>> +	int type;
>> +
>> +	switch (msg->state) {
>> +	case RPMH_ACTIVE_ONLY_STATE:
>> +		type = ACTIVE_TCS;
>> +		break;
>> +	default:
>> +		return ERR_PTR(-EINVAL);
>> +	}
>> +
>> +	return get_tcs_of_type(drv, type);
>> +}
>> +
>> +static void send_tcs_response(struct tcs_response *resp)
>> +{
>> +	struct rsc_drv *drv;
>> +	unsigned long flags;
>> +
>> +	if (!resp)
>> +		return;
>> +
>> +	drv = resp->drv;
>> +	spin_lock_irqsave(&drv->drv_lock, flags);
>> +	INIT_LIST_HEAD(&resp->list);
>> +	list_add_tail(&resp->list, &drv->response_pending);
>> +	spin_unlock_irqrestore(&drv->drv_lock, flags);
>> +
>> +	tasklet_schedule(&drv->tasklet);
>> +}
>> +
>> +/**
>> + * tcs_irq_handler: TX Done interrupt handler
>> + */
>> +static irqreturn_t tcs_irq_handler(int irq, void *p)
>> +{
>> +	struct rsc_drv *drv = p;
>> +	int m, i;
>> +	u32 irq_status, sts;
>> +	struct tcs_response *resp;
>> +	struct tcs_cmd *cmd;
>> +
>> +	irq_status = read_tcs_reg(drv, RSC_DRV_IRQ_STATUS, 0, 0);
>> +
>> +	for (m = 0; m < drv->num_tcs; m++) {
>> +		if (!(irq_status & (u32)BIT(m)))
>> +			continue;
>> +
>> +		resp = get_response(drv, m);
>> +		if (WARN_ON(!resp))
>
>This will only ever fail in the beginning of time, as soon as you've
>utilized every TCS at least once resp will never be NULL, as you never
>clear it.
>
>> +			goto skip_resp;
>> +
>> +		resp->err = 0;
>> +		for (i = 0; i < resp->msg->num_cmds; i++) {
>> +			cmd = &resp->msg->cmds[i];
>> +			sts = read_tcs_reg(drv, RSC_DRV_CMD_STATUS, m, i);
>> +			if (!(sts & CMD_STATUS_ISSUED) ||
>> +			   ((resp->msg->wait_for_compl || cmd->wait) &&
>> +			   !(sts & CMD_STATUS_COMPL))) {
>> +				resp->err = -EIO;
>> +				break;
>> +			}
>> +		}
>> +skip_resp:
>> +		/* Reclaim the TCS */
>> +		write_tcs_reg(drv, RSC_DRV_CMD_ENABLE, m, 0, 0);
>> +		write_tcs_reg(drv, RSC_DRV_IRQ_CLEAR, 0, 0, BIT(m));
>> +		clear_bit(m, drv->tcs_in_use);
>> +		send_tcs_response(resp);
>
>As I suggested above, rather than putting resp on a list and schedule a
>tasklet to free and possibly deliver the value or "err" to a client just
>keep track of the current msg for the TCS for sync operations, set "err"
>and fire the completion (and then untie the request from the TCS).
>
Ok.

>> +	}
>> +
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +/**
>> + * tcs_notify_tx_done: TX Done for requests that got a response
>> + *
>> + * @data: the tasklet argument
>> + *
>> + * Tasklet function to notify MBOX that we are done with the request.
>> + * Handles all pending reponses whenever run.
>
>This is accidental complexity from the downstream use of the mailbox
>framework, we don't need it.
>
Yes. will remove this.

>> + */
>> +static void tcs_notify_tx_done(unsigned long data)
>> +{
>> +	struct rsc_drv *drv = (struct rsc_drv *)data;
>> +	struct tcs_response *resp;
>> +	unsigned long flags;
>> +
>> +	for (;;) {
>> +		spin_lock_irqsave(&drv->drv_lock, flags);
>> +		resp = list_first_entry_or_null(&drv->response_pending,
>> +						struct tcs_response, list);
>> +		if (!resp) {
>> +			spin_unlock_irqrestore(&drv->drv_lock, flags);
>> +			break;
>> +		}
>> +		list_del(&resp->list);
>> +		spin_unlock_irqrestore(&drv->drv_lock, flags);
>> +		free_response(resp);
>> +	}
>> +}
>> +
>> +static void __tcs_buffer_write(struct rsc_drv *drv, int m, int n,
>> +			       const struct tcs_request *msg)
>> +{
>> +	u32 msgid, cmd_msgid;
>> +	u32 cmd_enable = 0;
>> +	u32 cmd_complete;
>> +	struct tcs_cmd *cmd;
>> +	int i, j;
>> +
>> +	cmd_msgid = CMD_MSGID_LEN;
>> +	cmd_msgid |= msg->wait_for_compl ? CMD_MSGID_RESP_REQ : 0;
>> +	cmd_msgid |= CMD_MSGID_WRITE;
>> +
>> +	cmd_complete = read_tcs_reg(drv, RSC_DRV_CMD_WAIT_FOR_CMPL, m, 0);
>> +
>> +	for (i = 0, j = n; i < msg->num_cmds; i++, j++) {
>> +		cmd = &msg->cmds[i];
>> +		cmd_enable |= BIT(j);
>> +		cmd_complete |= cmd->wait << j;
>> +		msgid = cmd_msgid;
>> +		msgid |= cmd->wait ? CMD_MSGID_RESP_REQ : 0;
>> +		write_tcs_reg(drv, RSC_DRV_CMD_MSGID, m, j, msgid);
>> +		write_tcs_reg(drv, RSC_DRV_CMD_ADDR, m, j, cmd->addr);
>> +		write_tcs_reg(drv, RSC_DRV_CMD_DATA, m, j, cmd->data);
>> +	}
>> +
>> +	write_tcs_reg(drv, RSC_DRV_CMD_WAIT_FOR_CMPL, m, 0, cmd_complete);
>> +	cmd_enable |= read_tcs_reg(drv, RSC_DRV_CMD_ENABLE, m, 0);
>> +	write_tcs_reg(drv, RSC_DRV_CMD_ENABLE, m, 0, cmd_enable);
>> +}
>> +
>> +static void __tcs_trigger(struct rsc_drv *drv, int m)
>> +{
>> +	u32 enable;
>
>"enable"?
>
Sorry?
>> +
>> +	/*
>> +	 * HW req: Clear the DRV_CONTROL and enable TCS again
>> +	 * While clearing ensure that the AMC mode trigger is cleared
>> +	 * and then the mode enable is cleared.
>> +	 */
>> +	enable = read_tcs_reg(drv, RSC_DRV_CONTROL, m, 0);
>> +	enable &= ~TCS_AMC_MODE_TRIGGER;
>> +	write_tcs_reg_sync(drv, RSC_DRV_CONTROL, m, 0, enable);
>> +	enable &= ~TCS_AMC_MODE_ENABLE;
>> +	write_tcs_reg_sync(drv, RSC_DRV_CONTROL, m, 0, enable);
>> +
>> +	/* Enable the AMC mode on the TCS and then trigger the TCS */
>> +	enable = TCS_AMC_MODE_ENABLE;
>> +	write_tcs_reg_sync(drv, RSC_DRV_CONTROL, m, 0, enable);
>> +	enable |= TCS_AMC_MODE_TRIGGER;
>> +	write_tcs_reg_sync(drv, RSC_DRV_CONTROL, m, 0, enable);
>> +}
>> +
>> +static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,
>> +				  const struct tcs_request *msg)
>> +{
>> +	unsigned long curr_enabled;
>> +	u32 addr;
>> +	int i, j, k;
>> +	int m = tcs->offset;
>> +
>> +	for (i = 0; i < tcs->num_tcs; i++, m++) {
>> +		if (tcs_is_free(drv, m))
>> +			continue;
>> +
>> +		curr_enabled = read_tcs_reg(drv, RSC_DRV_CMD_ENABLE, m, 0);
>> +
>> +		for_each_set_bit(j, &curr_enabled, MAX_CMDS_PER_TCS) {
>> +			addr = read_tcs_reg(drv, RSC_DRV_CMD_ADDR, m, j);
>> +			for (k = 0; k < msg->num_cmds; k++) {
>> +				if (addr == msg->cmds[k].addr)
>> +					return -EBUSY;
>> +			}
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int find_free_tcs(struct tcs_group *tcs)
>> +{
>> +	int m;
>> +
>> +	for (m = 0; m < tcs->num_tcs; m++) {
>> +		if (tcs_is_free(tcs->drv, tcs->offset + m))
>> +			return m;
>
>The returned index is within the tcs but is passed to setup_response()
>where it's used as the index of the TCS, so this needs to return
>tcs->offset + m so that setup_response() will be able to find the tcs
>again.
>
Correct. Will fix.

>> +	}
>> +
>> +	return -EBUSY;
>> +}
>> +
>> +static int tcs_mbox_write(struct rsc_drv *drv, const struct tcs_request *msg)
>> +{
>> +	struct tcs_group *tcs;
>> +	int m;
>> +	struct tcs_response *resp = NULL;
>
>No need to initialize resp.
>
Ok.

>> +	unsigned long flags;
>> +	int ret;
>> +
>> +	tcs = get_tcs_for_msg(drv, msg);
>> +	if (IS_ERR(tcs))
>> +		return PTR_ERR(tcs);
>> +
>> +	spin_lock_irqsave(&tcs->lock, flags);
>> +	m = find_free_tcs(tcs);
>> +	if (m < 0) {
>> +		ret = m;
>> +		goto done_write;
>> +	}
>> +
>> +	/*
>> +	 * The h/w does not like if we send a request to the same address,
>> +	 * when one is already in-flight or being processed.
>> +	 */
>> +	ret = check_for_req_inflight(drv, tcs, msg);
>
>This scans all TCS in the DRV for any operations on msg->cmds[*].addr,
>but you're only holding a lock for tcs. Either cross-tcs operations
>doesn't matter and check_for_req_inflight() can loose one of the loops
>or the locking used here is too optimistic.
>
We only need to look at the AMC TCSes and see if any of them are in use.
We dont care about the request being present in sleep/wake TCS.

>> +	if (ret)
>> +		goto done_write;
>> +
>> +	resp = setup_response(drv, msg, m);
>
>Alternatively we could just actually pass "tcs" to setup_response() so
>that it doesn't have to search for it based on drv and m. But I think
>it's cleaner if we just associate the msg with the TCS and complete that
>directly in the irq handler - if it's a sync operation.
>
Got it.

>> +	if (IS_ERR(resp)) {
>> +		ret = PTR_ERR(resp);
>> +		goto done_write;
>> +	}
>> +	resp->m = m;
>
>You never read resp->m...
>
Remnant. Will remove the structure itself.

>> +
>> +	set_bit(m, drv->tcs_in_use);
>> +	__tcs_buffer_write(drv, m, 0, msg);
>> +	__tcs_trigger(drv, m);
>> +
>> +done_write:
>> +	spin_unlock_irqrestore(&tcs->lock, flags);
>> +	return ret;
>> +}
>> +
>> +/**
>> + * rpmh_rsc_send_data: Validate the incoming message and write to the
>> + * appropriate TCS block.
>> + *
>> + * @drv: the controller
>> + * @msg: the data to be sent
>> + *
>> + * Return: 0 on success, -EINVAL on error.
>> + * Note: This call blocks until a valid data is written to the TCS.
>> + */
>> +int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg)
>> +{
>> +	int ret;
>> +
>> +	if (!msg || !msg->cmds || !msg->num_cmds ||
>> +	    msg->num_cmds > MAX_RPMH_PAYLOAD)
>> +		return -EINVAL;
>
>You're the only caller of this function, which means that if this ever
>evaluates to true you will return -EINVAL and your bug will be way
>harder to find than if you just end up panicing because we dereferenced
>any of these null pointers.
>
>At least wrap the whole thing in a WARN_ON() to make it possible to
>detect when this happen.
>
Ok. Will do.

>> +
>> +	do {
>> +		ret = tcs_mbox_write(drv, msg);
>> +		if (ret == -EBUSY) {
>> +			pr_info_ratelimited("TCS Busy, retrying RPMH message send: addr=%#x\n",
>> +					    msg->cmds[0].addr);
>> +			udelay(10);
>> +		}
>> +	} while (ret == -EBUSY);
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(rpmh_rsc_send_data);
>

Thanks for the review.

-- Lina

  reply	other threads:[~2018-04-13 16:16 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-05 16:18 [PATCH v5 00/10] drivers/qcom: add RPMH communication suppor Lina Iyer
2018-04-05 16:18 ` [PATCH v5 01/10] drivers: qcom: rpmh-rsc: add RPMH controller for QCOM SoCs Lina Iyer
2018-04-10 23:40   ` Bjorn Andersson
2018-04-13 16:16     ` Lina Iyer [this message]
     [not found]   ` <152342153642.180276.4927130412537860735@swboyd.mtv.corp.google.com>
2018-04-13 15:37     ` Lina Iyer
2018-04-13 17:43       ` Stephen Boyd
2018-04-16 19:51         ` Lina Iyer
2018-04-05 16:18 ` [PATCH v5 02/10] dt-bindings: introduce RPMH RSC bindings for Qualcomm SoCs Lina Iyer
2018-04-07  1:14   ` Stephen Boyd
2018-04-09 16:08     ` Lina Iyer
2018-04-10 19:36       ` Bjorn Andersson
2018-04-11 21:26         ` Lina Iyer
2018-04-11 15:29       ` Stephen Boyd
2018-04-11 21:24         ` Lina Iyer
2018-04-13 22:40           ` Stephen Boyd
2018-04-16 16:08             ` Lina Iyer
2018-04-17  6:01               ` Stephen Boyd
2018-04-18 19:31                 ` Lina Iyer
2018-04-05 16:18 ` [PATCH v5 03/10] drivers: qcom: rpmh-rsc: log RPMH requests in FTRACE Lina Iyer
2018-04-05 18:32   ` Steven Rostedt
2018-04-05 16:18 ` [PATCH v5 04/10] drivers: qcom: rpmh: add RPMH helper functions Lina Iyer
2018-04-07  1:21   ` Stephen Boyd
2018-04-09 15:36     ` Lina Iyer
2018-04-11  2:23       ` Stephen Boyd
2018-04-11 16:34         ` Lina Iyer
2018-04-11  0:01   ` Bjorn Andersson
2018-04-13 17:18     ` Stephen Boyd
2018-04-05 16:18 ` [PATCH v5 05/10] drivers: qcom: rpmh-rsc: write sleep/wake requests to TCS Lina Iyer
2018-04-11  0:31   ` Bjorn Andersson
2018-04-11 16:33     ` Lina Iyer
2018-04-05 16:18 ` [PATCH v5 06/10] drivers: qcom: rpmh-rsc: allow invalidation of sleep/wake TCS Lina Iyer
2018-04-05 16:18 ` [PATCH v5 07/10] drivers: qcom: rpmh: cache sleep/wake state requests Lina Iyer
2018-04-19  6:12   ` Stephen Boyd
2018-04-19 15:06     ` Lina Iyer
2018-04-05 16:18 ` [PATCH v5 08/10] drivers: qcom: rpmh: allow requests to be sent asynchronously Lina Iyer
2018-04-05 16:18 ` [PATCH v5 09/10] drivers: qcom: rpmh: add support for batch RPMH request Lina Iyer
2018-04-05 16:18 ` [PATCH v5 10/10] drivers: qcom: rpmh-rsc: allow active requests from wake TCS Lina Iyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180413161652.GB1209@codeaurora.org \
    --to=ilina@codeaurora.org \
    --cc=andy.gross@linaro.org \
    --cc=bjorn.andersson@linaro.org \
    --cc=david.brown@linaro.org \
    --cc=dianders@chromium.org \
    --cc=evgreen@chromium.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-soc@vger.kernel.org \
    --cc=rnayak@codeaurora.org \
    --cc=sboyd@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).