linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Gregory Haskins <ghaskins@novell.com>
To: Gilles Carry <gilles.carry@bull.net>
Cc: linux-rt-users@vger.kernel.org, rostedt@goodmis.org,
	tinytim@us.ibm.com, jean-pierre.dion@bull.net,
	sebastien.dugue@bull.net
Subject: Re: [PATCH replace] Fix pushable_tasks list corruption
Date: Fri, 03 Oct 2008 08:50:55 -0400	[thread overview]
Message-ID: <48E6152F.1020400@novell.com> (raw)
In-Reply-To: <1223022275-6626-1-git-send-email-gilles.carry@bull.net>

[-- Attachment #1: Type: text/plain, Size: 2800 bytes --]

Hi Gilles,

Gilles Carry wrote:
> From: gilles.carry <gilles.carry@bull.net>
>
> Symptoms:
> System hang (endless loop in plist_check_list)
> or BUG because of faulty prev/next pointers in
> pushable_task node.
>
> When push_rt_task successes finding a task to push away, it
> performs a double lock on the runqueues (local and target) but
> before getting both locks, it releases the local rq lock allowing
> other cpus grab the task in between. (eg. pull_rt_task, timers...)
> In some situations, when push_rt_task calls dequeue_pushable_task
> the task may have already been removed from the pushable_tasks
> list by another cpu. Removing the node again corrupts the list.
>
> This patch adds a sanity check to dequeue_pushable_task which only
> removes the node if it's still on the original list.
>   

I think your analysis is spot on, though I am still concerned about your
implementation.  It still "papers over" the issue, so to speak, which is
that the item moved to a new RQ.  In addition, it adds a relatively
expensive check to the fast path of the list dequeue.

Based on this, I have submitted patches that address your finding but in
a more targeted and efficient approach (crediting you and Chirag) here:

http://lkml.org/lkml/2008/10/3/130

So I have to respectfully NACK this patch, but I do thank you for
investigating and pointing me at the actual root problem.  It is
sincerely appreciated.

Regards,
-Greg

> Signed-off-by: Gilles Carry <gilles.carry@bull.net>
> Cc: ghaskins@novell.com
> ---
>  kernel/sched_rt.c |   18 +++++++++++++++++-
>  1 files changed, 17 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
> index 57a0c0d..7aa4450 100644
> --- a/kernel/sched_rt.c
> +++ b/kernel/sched_rt.c
> @@ -62,7 +62,17 @@ static void enqueue_pushable_task(struct rq *rq, struct task_struct *p)
>  
>  static void dequeue_pushable_task(struct rq *rq, struct task_struct *p)
>  {
> -	plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks);
> +	struct plist_node *next;
> +
> +	/* Sanity check: delete only if node is still on this list */
> +	plist_for_each(next, &rq->rt.pushable_tasks) {
> +		if (&p->pushable_tasks == next) {
> +			plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks);
> +			return;
> +		}
> +	}
> +
> +
>  }
>  
>  #else
> @@ -1105,6 +1115,12 @@ static int push_rt_task(struct rq *rq)
>  		 * try again, since the other cpus will pull from us
>  		 * when they are ready
>  		 */
> +
> +		/*
> +		 * If we reach here and the task has migrated to another cpu
> +		 * (paranoid == 0?) calling dequeue_pushable_task may cause
> +		 * pushable_tasks list corruption.
> +		 */
>  		dequeue_pushable_task(rq, next_task);
>  		goto out;
>  	}
>   



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

      reply	other threads:[~2008-10-03 12:46 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-03  8:24 [PATCH replace] Fix pushable_tasks list corruption Gilles Carry
2008-10-03 12:50 ` Gregory Haskins [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48E6152F.1020400@novell.com \
    --to=ghaskins@novell.com \
    --cc=gilles.carry@bull.net \
    --cc=jean-pierre.dion@bull.net \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=sebastien.dugue@bull.net \
    --cc=tinytim@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).