From: Alexey Klimov <alexey.klimov@arm.com>
To: Jassi Brar <jaswinder.singh@linaro.org>
Cc: Jassi Brar <jassisinghbrar@gmail.com>,
Sudeep Holla <sudeep.holla@arm.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
alexey.klimov@arm.com
Subject: Re: [PATCH RFC] mailbox: move controller timer to per-channel timers
Date: Tue, 30 May 2017 17:29:56 +0100 [thread overview]
Message-ID: <20170530162956.GA4974@arm.com> (raw)
In-Reply-To: <CAJe_ZhfcQAqsunYaCpV86xPqYBB7Lwj=-kHPdLxnoRi71uumGQ@mail.gmail.com>
On Thu, May 25, 2017 at 6:43 PM, Alexey Klimov <alexey.klimov@arm.com> wrote:
> Hi Jassi,
>
> sorry for delay again.
>
> On Tue, Apr 11, 2017 at 06:30:08PM +0530, Jassi Brar wrote:
> > On 11 April 2017 at 18:04, Alexey Klimov <alexey.klimov@arm.com> wrote:
> > > On Fri, Apr 07, 2017 at 08:39:35PM +0530, Jassi Brar wrote:
> > >> On Thu, Apr 6, 2017 at 11:01 PM, Alexey Klimov <alexey.klimov@arm.com> wrote:
> > >> > When mailbox controller provides two or more channels and
> > >> > they are actively used by mailbox client(s) it's very easy
> > >> > to trigger the warning in hrtimer_forward():
> > >> >
> > >> > [ 247.853060] WARNING: CPU: 6 PID: 0 at kernel/time/hrtimer.c:805 hrtimer_forward+0x88/0xd8
> > >> > [ 247.853549] Modules linked in:
> > >> > [ 247.853907] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.11.0-rc2-00362-g93afaa4513bb-dirty #13
> > >> > [ 247.854472] Hardware name: linux,dummy-virt (DT)
> > >> > [ 247.854699] task: ffff80001d89d780 task.stack: ffff80001d8c4000
> > >> > [ 247.854999] PC is at hrtimer_forward+0x88/0xd8
> > >> > [ 247.855280] LR is at txdone_hrtimer+0xd4/0xf8
> > >> > [ 247.855551] pc : [<ffff0000081039f0>] lr : [<ffff00000881b874>] pstate: 200001c5
> > >> > [ 247.855857] sp : ffff80001efbdeb0
> > >> > [ 247.856072] x29: ffff80001efbdeb0 x28: ffff80001efc3140
> > >> > [ 247.856358] x27: ffff00000881b7a0 x26: 00000039ac93e8b6
> > >> > [ 247.856604] x25: ffff000008e756be x24: ffff80001c4a1348
> > >> > [ 247.856882] x23: 0000000000000001 x22: 00000000000000f8
> > >> > [ 247.857189] x21: ffff80001c4a1318 x20: ffff80001d327110
> > >> > [ 247.857509] x19: 00000000000f4240 x18: 0000000000000030
> > >> > [ 247.857808] x17: 0000ffffaecdf370 x16: ffff0000081ccc80
> > >> > [ 247.858000] x15: 0000000000000010 x14: 00000000fffffff0
> > >> > [ 247.858186] x13: ffff000008f488e0 x12: 000000000002e3eb
> > >> > [ 247.858381] x11: ffff000008979690 x10: 0000000000000000
> > >> > [ 247.858573] x9 : 0000000000000001 x8 : ffff80001efc66e0
> > >> > [ 247.858758] x7 : ffff80001efc6708 x6 : 00000005be7732f2
> > >> > [ 247.858943] x5 : 0000000000000001 x4 : ffff80001c4a1348
> > >> > [ 247.859130] x3 : 00000039ac94952a x2 : 00000000000f4240
> > >> > [ 247.859315] x1 : 00000039ac98243c x0 : 0000000000038f12
> > >> > [ 247.859582] ---[ end trace d61812426ec3c30b ]---
> > >> >
> > >> > To fix this current patch migrates hr timers to be per-channel
> > >> > instead of using only one timer per-controller.
> > >> >
> > >> I think we can do by just checking if hrtimer_active() returns false
> > >> before we do hrtimer_start() in msg_submit() ?
> > >
> > > It looks like it can be easily broken:
> > >
> > > 1) let's say first thread executes timer callback and already checked last_tx_done
> > > on channel 0;
> > > 2) second thread submits a message to the controller, say, on channel 0 and with
> > > help of hrtimer_active() observes that the timer is active (because timer callback
> > > is running) and decides not to (re-)start timer;
> > >
> > > After this first thread decides not to restart the timer and finishes callback.
> > > The thing that first thread executes tx_tick isn't helpful: for example first
> > > thread may have no messages to submit on any channel and therefore is not going
> > > to deal with timer.
> > >
> > > Finally, mailbox state machine is stalled. Second thread thinks that timer is
> > > active while it's not.
> > >
> > ... you mean race :) and we have locks for that. You want me to send
> > in a patch?
>
> We don't have separate lock for timer.
>
> > > One of the main questions is that there is only one timer per few channels
> > > in current code.
> > >
> > I see that as a good thing because
> > a) Polling anyway doesn't provide 'hard' guarantee even if we have one
> > timer per channel
> > b) The poll period remains same for every channel, so functionality
> > wise you only increase timer callbacks.
>
> Do you mean something like this below?
>
> The patch isn't really tested on multi-channel environment yet but
> I will test it. I just want to know if I am on the right way here.
>
> I know there are some adjustments that can be done in the loop in hr-timer
> callback. The thing that I don't like here is a lot of spin_lock/unlocks
> in the timer callback.
Hi Jassi,
I was able to come up with new version with only HR-timer spinlock in timer
callback. That one seems a little bit better.
The patch is below:
>From df8264ec3b1db61fdfe0f381fabb6abdecfb3ec1 Mon Sep 17 00:00:00 2001
From: Alexey Klimov <alexey.klimov@arm.com>
Date: Thu, 29 May 2017 17:40:11 +0100
Subject: [PATCH RFC] mailbox: fix hrtimer_forward() warning
When mailbox controller provides two or more channels,
timer-based polling is used by mailbox controller
and channels are actively used by mailbox client(s) it's very
easy to trigger the warning in hrtimer_forward():
[ 247.853060] WARNING: CPU: 6 PID: 0 at kernel/time/hrtimer.c:805 hrtimer_forward+0x88/0xd8
[ 247.853549] Modules linked in:
[ 247.853907] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.11.0-rc2-00362-g93afaa4513bb-dirty #13
[ 247.854472] Hardware name: linux,dummy-virt (DT)
[ 247.854699] task: ffff80001d89d780 task.stack: ffff80001d8c4000
[ 247.854999] PC is at hrtimer_forward+0x88/0xd8
[ 247.855280] LR is at txdone_hrtimer+0xd4/0xf8
[ 247.855551] pc : [<ffff0000081039f0>] lr : [<ffff00000881b874>] pstate: 200001c5
[ 247.855857] sp : ffff80001efbdeb0
[ 247.856072] x29: ffff80001efbdeb0 x28: ffff80001efc3140
[ 247.856358] x27: ffff00000881b7a0 x26: 00000039ac93e8b6
[ 247.856604] x25: ffff000008e756be x24: ffff80001c4a1348
[ 247.856882] x23: 0000000000000001 x22: 00000000000000f8
[ 247.857189] x21: ffff80001c4a1318 x20: ffff80001d327110
[ 247.857509] x19: 00000000000f4240 x18: 0000000000000030
[ 247.857808] x17: 0000ffffaecdf370 x16: ffff0000081ccc80
[ 247.858000] x15: 0000000000000010 x14: 00000000fffffff0
[ 247.858186] x13: ffff000008f488e0 x12: 000000000002e3eb
[ 247.858381] x11: ffff000008979690 x10: 0000000000000000
[ 247.858573] x9 : 0000000000000001 x8 : ffff80001efc66e0
[ 247.858758] x7 : ffff80001efc6708 x6 : 00000005be7732f2
[ 247.858943] x5 : 0000000000000001 x4 : ffff80001c4a1348
[ 247.859130] x3 : 00000039ac94952a x2 : 00000000000f4240
[ 247.859315] x1 : 00000039ac98243c x0 : 0000000000038f12
[ 247.859582] ---[ end trace d61812426ec3c30b ]---
To fix it this patch introduces one more mbox spinlock for hr-timer and
hrt_active variable in controller structure.
The hr-timer callback restarts timer unless there no active requests
on any channel.
Signed-off-by: Alexey Klimov <alexey.klimov@arm.com>
---
drivers/mailbox/mailbox.c | 51 ++++++++++++++++++++++++++++++--------
include/linux/mailbox_controller.h | 2 ++
2 files changed, 42 insertions(+), 11 deletions(-)
diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index 9dfbf7e..57ae422 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -85,9 +85,17 @@ static void msg_submit(struct mbox_chan *chan)
exit:
spin_unlock_irqrestore(&chan->lock, flags);
- if (!err && (chan->txdone_method & TXDONE_BY_POLL))
- /* kick start the timer immediately to avoid delays */
- hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
+ if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
+
+ spin_lock_irqsave(&chan->mbox->lock_hrt, flags);
+ /* try to start the timer immediately to avoid delays */
+ if (!chan->mbox->hrt_active) {
+ chan->mbox->hrt_active = 1;
+ hrtimer_start(&chan->mbox->poll_hrt, 0,
+ HRTIMER_MODE_REL);
+ }
+ spin_unlock_irqrestore(&chan->mbox->lock_hrt, flags);
+ }
}
static void tx_tick(struct mbox_chan *chan, int r)
@@ -121,23 +129,43 @@ static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
bool txdone, resched = false;
int i;
+ spin_lock(&mbox->lock_hrt);
+
for (i = 0; i < mbox->num_chans; i++) {
struct mbox_chan *chan = &mbox->chans[i];
+ /*
+ * Reading of active_req is racy and we may miss an active
+ * request. However, the enqueuing thread will spin on the
+ * HR-timer lock, which we acquired here, and will start the
+ * timer if required.
+ */
if (chan->active_req && chan->cl) {
- txdone = chan->mbox->ops->last_tx_done(chan);
- if (txdone)
+ /*
+ * If we observe an active request on any channel, then
+ * we have to re-start the timer; if we see that
+ * a request has finished, we don't know if tx_tick()
+ * will schedule the next request or not. Therefore we
+ * have to re-sched the HR-timer here in all cases.
+ */
+ resched = true;
+
+ txdone = mbox->ops->last_tx_done(chan);
+ if (txdone) {
+ spin_unlock(&mbox->lock_hrt);
tx_tick(chan, 0);
- else
- resched = true;
+ spin_lock(&mbox->lock_hrt);
+ }
}
}
- if (resched) {
+ if (resched)
hrtimer_forward_now(hrtimer, ms_to_ktime(mbox->txpoll_period));
- return HRTIMER_RESTART;
- }
- return HRTIMER_NORESTART;
+ else
+ mbox->hrt_active = 0;
+ spin_unlock(&mbox->lock_hrt);
+
+ return resched ? HRTIMER_RESTART : HRTIMER_NORESTART;
}
/**
@@ -462,6 +490,7 @@ int mbox_controller_register(struct mbox_controller *mbox)
return -EINVAL;
}
+ spin_lock_init(&mbox->lock_hrt);
hrtimer_init(&mbox->poll_hrt, CLOCK_MONOTONIC,
HRTIMER_MODE_REL);
mbox->poll_hrt.function = txdone_hrtimer;
diff --git a/include/linux/mailbox_controller.h b/include/linux/mailbox_controller.h
index 74deadb..1dd4c47 100644
--- a/include/linux/mailbox_controller.h
+++ b/include/linux/mailbox_controller.h
@@ -83,6 +83,8 @@ struct mbox_controller {
const struct of_phandle_args *sp);
/* Internal to API */
struct hrtimer poll_hrt;
+ spinlock_t lock_hrt;
+ unsigned int hrt_active;
struct list_head node;
};
--
1.9.1
next prev parent reply other threads:[~2017-05-30 16:30 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-06 17:31 [PATCH RFC] mailbox: move controller timer to per-channel timers Alexey Klimov
2017-04-07 15:09 ` Jassi Brar
2017-04-11 12:34 ` Alexey Klimov
2017-04-11 13:00 ` Jassi Brar
2017-05-25 17:43 ` Alexey Klimov
2017-05-30 16:29 ` Alexey Klimov [this message]
2017-05-31 8:38 ` Jassi Brar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170530162956.GA4974@arm.com \
--to=alexey.klimov@arm.com \
--cc=jassisinghbrar@gmail.com \
--cc=jaswinder.singh@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=sudeep.holla@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.