From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: sedat.dilek@gmail.com
Cc: Stephen Rothwell <sfr@canb.auug.org.au>,
linux-next@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
peterz@infradead.org
Subject: Re: linux-next: Tree for April 14 (Call-traces: RCU/ACPI/WQ related?)
Date: Tue, 26 Apr 2011 08:42:55 -0700
Message-ID: <20110426154255.GA2135@linux.vnet.ibm.com>
In-Reply-To: <BANLkTiko8dcYQtTo2P80nk503vxNabaLPw@mail.gmail.com>
On Tue, Apr 26, 2011 at 02:50:25PM +0200, Sedat Dilek wrote:
> On Tue, Apr 26, 2011 at 2:42 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Apr 26, 2011 at 01:45:31PM +0200, Sedat Dilek wrote:
> >> On Tue, Apr 26, 2011 at 7:06 AM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> > On Sun, Apr 24, 2011 at 09:43:31AM -0700, Paul E. McKenney wrote:
> >> >> On Sun, Apr 24, 2011 at 11:36:44AM +0200, Sedat Dilek wrote:
> >> >> > On Sun, Apr 24, 2011 at 8:27 AM, Paul E. McKenney
> >> >> > <paulmck@linux.vnet.ibm.com> wrote:
> >> >>
> >> >> [ . . . ]
> >> >>
> >> >> > > OK, this looks unrelated, but just in case, could you please try it
> >> >> > > again with the following patch? (Not mainlinable, debug only.)
> >> >> > >
> >> >> > > Also, it does look like you are still seeing a grace-period hang.
> >> >> > > Could you please send the output of the script? Same one as last time.
> >> >> > >
> >> >> > > Thanx, Paul
> >> >> > >
> >> >> > > ------------------------------------------------------------------------
> >> >> > >
> >> >> > > debugobjects.c | 8 +++++---
> >> >> > > 1 file changed, 5 insertions(+), 3 deletions(-)
> >> >> > >
> >> >> > > diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> >> >> > > index 9d86e45..10a7c7a 100644
> >> >> > > --- a/lib/debugobjects.c
> >> >> > > +++ b/lib/debugobjects.c
> >> >> > > @@ -289,10 +289,12 @@ static void debug_object_is_on_stack(void *addr, int onstack)
> >> >> > > return;
> >> >> > >
> >> >> > > limit++;
> >> >> > > - if (is_on_stack)
> >> >> > > + if (is_on_stack) {
> >> >> > > + struct rcu_head *p = (struct rcu_head *)addr;
> >> >> > > printk(KERN_WARNING
> >> >> > > - "ODEBUG: object is on stack, but not annotated\n");
> >> >> > > - else
> >> >> > > + "ODEBUG: object is on stack, but not annotated: %p\n",
> >> >> > > + p->func);
> >> >> > > + } else
> >> >> > > printk(KERN_WARNING
> >> >> > > "ODEBUG: object is not on stack, but annotated\n");
> >> >> > > WARN_ON(1);
> >> >> > >
> >> >> >
> >> >> > Somehow your attached patch did not apply.
> >> >> > As the changes were only a few lines, I applied them by hand.
> >> >> > Attached are log, dmesg and patches (orig + mine)
> >> >>
> >> >> Hmmm... Does 0xc10231a1 correspond to a function in your build? If so,
> >> >> could you please let me know which one?
> >> >>
> >> >> OK, so according to "ps" the per-CPU kthread is runnable, but it appears
> >> >> to never run. You only have one CPU, so it cannot be waiting due to
> >> >> running on the wrong CPU. The only other loop is in wait_event(), and
> >> >> that code looks good -- besides, if wait_event() was broken, we would
> >> >> be seeing breakage everywhere.
> >> >>
> >> >> Peter, any thoughts on what I might have done wrong to get the scheduler
> >> >> into a state where it was ignoring a runnable realtime task?
> >> >
> >> > Hello, Sedat,
> >> >
> >> > Here is a diagnostic patch to apply on top of sedat.2011.04.23a from
> >> > the -rcu git tree. Could you please try it out, let me know what
> >> > happens, and run the last collectdebugfs.sh during the test?
> >> >
> >> > Thanx, Paul
> >> >
> >> > ------------------------------------------------------------------------
> >> >
> >> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> >> > index 6cf6e47..65ae701 100644
> >> > --- a/kernel/rcutree.c
> >> > +++ b/kernel/rcutree.c
> >> > @@ -1524,9 +1524,9 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
> >> > return;
> >> > if (to_rt) {
> >> > policy = SCHED_NORMAL;
> >> > - sp.sched_priority = RCU_KTHREAD_PRIO;
> >> > + sp.sched_priority = 0;
> >> > } else {
> >> > - policy = SCHED_FIFO;
> >> > + policy = SCHED_NORMAL;
> >> > sp.sched_priority = 0;
> >> > }
> >> > sched_setscheduler_nocheck(t, policy, &sp);
> >> > @@ -1566,8 +1566,8 @@ static void rcu_yield(void (*f)(unsigned long), unsigned long arg)
> >> > sp.sched_priority = 0;
> >> > sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
> >> > schedule();
> >> > - sp.sched_priority = RCU_KTHREAD_PRIO;
> >> > - sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);
> >> > + sp.sched_priority = 0;
> >> > + sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
> >> > del_timer(&yield_timer);
> >> > }
> >> >
> >> > @@ -1671,8 +1671,8 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
> >> > WARN_ON_ONCE(per_cpu(rcu_cpu_kthread_task, cpu) != NULL);
> >> > per_cpu(rcu_cpu_kthread_task, cpu) = t;
> >> > wake_up_process(t);
> >> > - sp.sched_priority = RCU_KTHREAD_PRIO;
> >> > - sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> >> > + sp.sched_priority = 0;
> >> > + sched_setscheduler_nocheck(t, SCHED_NORMAL, &sp);
> >> > return 0;
> >> > }
> >> >
> >> > @@ -1713,8 +1713,8 @@ static int rcu_node_kthread(void *arg)
> >> > continue;
> >> > }
> >> > per_cpu(rcu_cpu_has_work, cpu) = 1;
> >> > - sp.sched_priority = RCU_KTHREAD_PRIO;
> >> > - sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> >> > + sp.sched_priority = 0;
> >> > + sched_setscheduler_nocheck(t, SCHED_NORMAL, &sp);
> >> > preempt_enable();
> >> > }
> >> > }
> >> > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> >> > index a21413d..baee185 100644
> >> > --- a/kernel/rcutree_plugin.h
> >> > +++ b/kernel/rcutree_plugin.h
> >> > @@ -1307,8 +1307,8 @@ static int __cpuinit rcu_spawn_one_boost_kthread(struct rcu_state *rsp,
> >> > rnp->boost_kthread_task = t;
> >> > raw_spin_unlock_irqrestore(&rnp->lock, flags);
> >> > wake_up_process(t);
> >> > - sp.sched_priority = RCU_KTHREAD_PRIO;
> >> > - sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> >> > + sp.sched_priority = 0;
> >> > + sched_setscheduler_nocheck(t, SCHED_NORMAL, &sp);
> >> > return 0;
> >> > }
> >> >
> >> >
> >>
> >> Hi Paul,
> >>
> >> I have tested with your patch and kept the kernel-config file from
> >> previous tests (don't get confused by the new name).
> >> Hope this helps you.
> >>
> >> I have some questions about kernel-config options, especially the
> >> X86_UP and CONFIG_RCU_FANOUT=32 options.
> >> To what extent can they influence our RCU issue?
> >> The options below were not set for this round of testing, but I would
> >> like some feedback on them.
> >> Thanks in advance.
> >>
> >> Would these settings be more optimal for a UP-machine?
> >>
> >> # CONFIG_SMP is not set
> >> # CONFIG_M486 is not set
> >> CONFIG_M686=y
> >> CONFIG_NR_CPUS=1
> >
> > These should be fine.
> >
> >> CONFIG_X86_UP_APIC=y
> >> CONFIG_X86_UP_IOAPIC=y
> >
> > These I don't know about.
> >
> >> CONFIG_HIGHMEM4G=y
> >
> > This one seems good for allowing the system to go as long as possible.
> >
> >> Is CONFIG_RCU_FANOUT=32 OK?
> >
> > On a UP system, this one doesn't matter.
> >
> >> With reverting commit 687d7a960aea46e016182c7ce346d62c4dbd0366 ("rcu:
> >> restrict TREE_RCU to SMP builds with !PREEMPT").
> >
> > Thank you for trying this one out!
> >
> > I don't see any sign of a grace-period hang. Did your test complete
> > correctly?
> >
> > Thanx, Paul
> >
>
> Thanks for the comments.
>
> I let the script run for a long time (approx. one hour) while doing
> my daily work in parallel.
> Then I booted into a known-good kernel.
> Did I miss something? Should I stress the system more?

I wouldn't know -- I never have been able to reproduce this.
For the moment, I will do my inspections assuming that the bug
has something to do with realtime priority.

Thank you again for your testing!

							Thanx, Paul
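
The diagnostic patch quoted above replaces SCHED_FIFO at RCU_KTHREAD_PRIO
with SCHED_NORMAL at priority 0 for the RCU kthreads. As a rough userspace
analogue (a sketch only, not kernel code from this thread), the same kind of
policy change can be made and inspected through the POSIX scheduling calls.
The helper names policy_name(), report_policy() and demote_to_normal() are
made up for illustration; SCHED_OTHER is the userspace name for what the
kernel calls SCHED_NORMAL.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: map a scheduling policy constant to a printable name. */
const char *policy_name(int policy)
{
	switch (policy) {
	case SCHED_OTHER:
		return "SCHED_OTHER";	/* SCHED_NORMAL inside the kernel */
	case SCHED_FIFO:
		return "SCHED_FIFO";
	case SCHED_RR:
		return "SCHED_RR";
	default:
		return "unknown";
	}
}

/*
 * Hypothetical helper: print the policy and priority of a task.
 * Returns the policy, or -1 on error.  A pid of 0 means the caller.
 */
int report_policy(pid_t pid)
{
	struct sched_param sp;
	int policy = sched_getscheduler(pid);

	if (policy < 0 || sched_getparam(pid, &sp) < 0) {
		perror("sched_getscheduler/sched_getparam");
		return -1;
	}
	printf("pid %ld: %s, priority %d\n",
	       (long)pid, policy_name(policy), sp.sched_priority);
	return policy;
}

/*
 * Hypothetical helper: switch a task to the normal (non-realtime) policy,
 * mirroring what the debug patch does for the RCU kthreads.
 * Returns 0 on success, -1 on error.
 */
int demote_to_normal(pid_t pid)
{
	struct sched_param sp;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = 0;	/* the only valid priority for SCHED_OTHER */
	return sched_setscheduler(pid, SCHED_OTHER, &sp);
}
```

Calling report_policy(0) before and after demote_to_normal(0) shows the
calling task's policy and priority. Changing another task's policy, or
raising one to SCHED_FIFO, normally needs additional privileges.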