From: Peter Zijlstra <peterz@infradead.org>
To: "Schmid, Carsten" <Carsten_Schmid@mentor.com>
Cc: "mingo@redhat.com" <mingo@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	walken@google.com, dave@stgolabs.net
Subject: Re: Crash in fair scheduler
Date: Tue, 3 Dec 2019 15:01:53 +0100
Message-ID: <20191203140153.GP2844@hirez.programming.kicks-ass.net>
In-Reply-To: <656260cf50684c11a3122aca88dde0cb@SVR-IES-MBX-03.mgc.mentorg.com>

On Tue, Dec 03, 2019 at 10:51:46AM +0000, Schmid, Carsten wrote:

> > > struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
> > > {
> > > 	struct rb_node *left = rb_first_cached(&cfs_rq->tasks_timeline);
> > >
> > > 	if (!left)
> > > 		return NULL; <<<<<<<<<< the case
> > >
> > > 	return rb_entry(left, struct sched_entity, run_node);
> > > }
> > 
> > This is the problem, for some reason the rbtree code got that rb_leftmost
> > thing wrecked.
> > 
> Any known issue on rbtree code regarding this?

I don't recall ever having seen this before. :/ Adding Davidlohr and
Michel who've poked at the rbtree code 'recently'.

> > > Is this a corner case nobody thought of or do we have cfs_rq data that is
> > > unexpected in its content?
> > 
> > No, the rbtree is corrupt. Your tree has a single node (which matches
> > with nr_running), but for some reason it thinks rb_leftmost is NULL.
> > This is wrong, if the tree is non-empty, it must have a leftmost
> > element.
> Is there a chance to find the left-most element in the core dump?

If there is only one entry in the tree, then that must also be the
leftmost entry. See your own later question :-)

> Maybe I can dig deeper to find the root cause then.
> Does any of the structs/data in this context point to some memory
> where I can continue to search?

There are only two places where rb_leftmost is updated,
rb_insert_color_cached() and rb_erase_cached() (the scheduler does not
use rb_replace_node_cached()).

We can 'forget' to set leftmost on insertion if @leftmost is somehow
false, and we can erroneously clear leftmost on erase if rb_next()
malfunctions.

No clues on which of those two cases happened.
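
To make the two failure modes concrete, here is a minimal user-space
sketch of how a cached leftmost pointer is maintained. It is only a
model, not the kernel's rbtree: there is no colouring or rebalancing,
and struct node, struct cached_root, model_next(), model_insert() and
model_erase() are made-up names for illustration. The insert path sets
the cache only if the descent never went right (the role of @leftmost),
and the erase path advances the cache to the in-order successor (the
role of rb_next()).

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified model: a plain BST, NOT the kernel's red-black tree. */
struct node {
	struct node *parent, *left, *right;
	long key;
};

struct cached_root {
	struct node *root;
	struct node *leftmost;	/* models rb_root_cached::rb_leftmost */
};

/* In-order successor; stands in for rb_next(). */
static struct node *model_next(struct node *n)
{
	if (n->right) {
		n = n->right;
		while (n->left)
			n = n->left;
		return n;
	}
	while (n->parent && n == n->parent->right)
		n = n->parent;
	return n->parent;
}

static void model_insert(struct cached_root *t, struct node *n)
{
	struct node **link = &t->root, *parent = NULL;
	bool leftmost = true;

	while (*link) {
		parent = *link;
		if (n->key < parent->key) {
			link = &parent->left;
		} else {
			link = &parent->right;
			leftmost = false;	/* went right at least once */
		}
	}
	n->parent = parent;
	n->left = n->right = NULL;
	*link = n;

	/* If this flag were wrongly false, the cache would go stale. */
	if (leftmost)
		t->leftmost = n;
}

/* Make old's parent (or the root) point at repl instead of old. */
static void replace_child(struct cached_root *t, struct node *old,
			  struct node *repl)
{
	struct node *p = old->parent;

	if (!p)
		t->root = repl;
	else if (p->left == old)
		p->left = repl;
	else
		p->right = repl;
	if (repl)
		repl->parent = p;
}

static void model_erase(struct cached_root *t, struct node *n)
{
	/* A broken successor lookup here would wrongly clear the cache. */
	if (t->leftmost == n)
		t->leftmost = model_next(n);

	if (!n->left) {
		replace_child(t, n, n->right);
	} else if (!n->right) {
		replace_child(t, n, n->left);
	} else {
		struct node *s = n->right;	/* successor: min of right subtree */

		while (s->left)
			s = s->left;
		if (s->parent != n) {
			replace_child(t, s, s->right);
			s->right = n->right;
			s->right->parent = s;
		}
		replace_child(t, n, s);
		s->left = n->left;
		s->left->parent = s;
	}
}
```

In this model a tree with a single node always ends up with leftmost
pointing at that node, and leftmost only becomes NULL when the last
node is erased.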

> Where should rb_leftmost point to if only one node is in the tree?
> To the node itself?

Exactly.


I suppose one approach is to add code to both __enqueue_entity() and
__dequeue_entity() that compares ->rb_leftmost to the result of
rb_first(). That'd incur some overhead but it'd double check the logic.
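
The shape of such a check, as a rough user-space sketch rather than an
actual patch: struct node, struct cached_root, model_first() and
leftmost_is_consistent() are hypothetical names, with model_first()
standing in for rb_first() by walking left from the root. In the kernel
the comparison would go in __enqueue_entity() and __dequeue_entity()
against cfs_rq->tasks_timeline.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified model types, NOT the kernel's rb_node / rb_root_cached. */
struct node {
	struct node *left, *right;
};

struct cached_root {
	struct node *root;
	struct node *leftmost;	/* models rb_root_cached::rb_leftmost */
};

/* Walk left from the root; this is the O(log n) cost of the check. */
static struct node *model_first(const struct cached_root *t)
{
	struct node *n = t->root;

	if (!n)
		return NULL;
	while (n->left)
		n = n->left;
	return n;
}

/* True when the cached pointer agrees with the tree itself. */
static bool leftmost_is_consistent(const struct cached_root *t)
{
	return t->leftmost == model_first(t);
}
```

The state from the crash, a single node in the tree with a NULL cached
leftmost, is exactly what this comparison flags; an empty tree with a
NULL leftmost passes.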
