From: Gregory Haskins <ghaskins@novell.com>
To: Gilles Carry <gilles.carry@bull.net>
Cc: linux-rt-users@vger.kernel.org, rostedt@goodmis.org,
tinytim@us.ibm.com, jean-pierre.dion@bull.net,
sebastien.dugue@bull.net
Subject: Re: [PATCH replace] Fix pushable_tasks list corruption
Date: Fri, 03 Oct 2008 08:50:55 -0400 [thread overview]
Message-ID: <48E6152F.1020400@novell.com> (raw)
In-Reply-To: <1223022275-6626-1-git-send-email-gilles.carry@bull.net>
[-- Attachment #1: Type: text/plain, Size: 2800 bytes --]
Hi Gilles,
Gilles Carry wrote:
> From: gilles.carry <gilles.carry@bull.net>
>
> Symptoms:
> System hang (endless loop in plist_check_list)
> or BUG because of faulty prev/next pointers in
> pushable_task node.
>
> When push_rt_task successes finding a task to push away, it
> performs a double lock on the runqueues (local and target) but
> before getting both locks, it releases the local rq lock allowing
> other cpus grab the task in between. (eg. pull_rt_task, timers...)
> In some situations, when push_rt_task calls dequeue_pushable_task
> the task may have already been removed from the pushable_tasks
> list by another cpu. Removing the node again corrupts the list.
>
> This patch adds a sanity check to dequeue_pushable_task which only
> removes the node if it's still on the original list.
>
I think your analysis is spot on, though I am still concerned about your
implementation. It still "papers over" the issue, so to speak, which is
that the item moved to a new RQ. In addition, it adds a relatively
expensive check to the fast path of the list dequeue.
Based on this, I have submitted patches that address your finding but in
a more targeted and efficient approach (crediting you and Chirag) here:
http://lkml.org/lkml/2008/10/3/130
So I have to respectfully NACK this patch, but I do thank you for
investigating and pointing me at the actual root problem. It is
sincerely appreciated.
Regards,
-Greg
> Signed-off-by: Gilles Carry <gilles.carry@bull.net>
> Cc: ghaskins@novell.com
> ---
> kernel/sched_rt.c | 18 +++++++++++++++++-
> 1 files changed, 17 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
> index 57a0c0d..7aa4450 100644
> --- a/kernel/sched_rt.c
> +++ b/kernel/sched_rt.c
> @@ -62,7 +62,17 @@ static void enqueue_pushable_task(struct rq *rq, struct task_struct *p)
>
> static void dequeue_pushable_task(struct rq *rq, struct task_struct *p)
> {
> - plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks);
> + struct plist_node *next;
> +
> + /* Sanity check: delete only if node is still on this list */
> + plist_for_each(next, &rq->rt.pushable_tasks) {
> + if (&p->pushable_tasks == next) {
> + plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks);
> + return;
> + }
> + }
> +
> +
> }
>
> #else
> @@ -1105,6 +1115,12 @@ static int push_rt_task(struct rq *rq)
> * try again, since the other cpus will pull from us
> * when they are ready
> */
> +
> + /*
> + * If we reach here and the task has migrated to another cpu
> + * (paranoid == 0?) calling dequeue_pushable_task may cause
> + * pushable_tasks list corruption.
> + */
> dequeue_pushable_task(rq, next_task);
> goto out;
> }
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]
prev parent reply other threads:[~2008-10-03 12:46 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-10-03 8:24 [PATCH replace] Fix pushable_tasks list corruption Gilles Carry
2008-10-03 12:50 ` Gregory Haskins [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48E6152F.1020400@novell.com \
--to=ghaskins@novell.com \
--cc=gilles.carry@bull.net \
--cc=jean-pierre.dion@bull.net \
--cc=linux-rt-users@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=sebastien.dugue@bull.net \
--cc=tinytim@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.