From: Kamalesh Babulal <kamalesh-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Nick Piggin <nickpiggin-/E1597aS9LT0CCvOHzKKcA@public.gmane.org>
Cc: "Rafael J. Wysocki" <rjw-KKrjLPT3xs0@public.gmane.org>,
	"Paul E. McKenney"
	<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	Alexey Dobriyan
	<adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Linux Kernel Mailing List
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Adrian Bunk <bunk-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Natalie Protasevich
	<protasnb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Kernel Testers List
	<kernel-testers-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: 2.6.26-rc9-git4: Reported regressions from 2.6.25
Date: Thu, 10 Jul 2008 14:33:51 +0530
Message-ID: <4875D077.8050109@linux.vnet.ibm.com>
In-Reply-To: <200807101725.36175.nickpiggin-/E1597aS9LT0CCvOHzKKcA@public.gmane.org>

Nick Piggin wrote:
> On Wednesday 09 July 2008 07:37, Rafael J. Wysocki wrote:
> 
>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11023
>> Subject		: 2.6.26-rc8-git2 - kernel BUG at mm/page_alloc.c:585
>> Submitter	: Kamalesh Babulal <kamalesh-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
>> Date		: 2008-07-02 11:55 (7 days old)
>> References	: http://lkml.org/lkml/2008/7/2/32
>> Handled-By	: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
> 
> I expect Andrew probably doesn't have time to delve into this. 
> Usual questions apply: is it reproducible, is it bisectable?
> Someone at IBM is probably best to handle it. Maybe try Mel or
> powerpc list?
> 

This is reproducible. I have copied the powerpc list on the bug report
sent to the list. I will try to bisect the bug.
> 
>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10906
>> Subject		: repeatable slab corruption with LTP msgctl08
>> Submitter	: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
>> Date		: 2008-06-12 5:13 (27 days old)
>> References	: http://marc.info/?l=linux-kernel&m=121324775927704&w=4
>> Handled-By	: Pekka J Enberg <penberg-bbCR+/B0CizivPeTLB3BmA@public.gmane.org>
>> 		  Christoph Lameter <clameter-sJ/iWh9BUns@public.gmane.org>
>> 		  Manfred Spraul <manfred-nhLOkwUX5cPe2c5cEj3t2g@public.gmane.org>
>> 		  Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>
> 
> I couldn't reproduce this one either. Maybe hardware failure?
> 
> 
>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10629
>> Subject		: 2.6.26-rc1-$sha1: RIP __d_lookup+0x8c/0x160
>> Submitter	: Alexey Dobriyan <adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Date		: 2008-05-05 09:59 (65 days old)
>> References	: http://lkml.org/lkml/2008/5/5/28
>> Handled-By	: Paul E. McKenney <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> 
> Attached is my fix for this problem. I don't think it is a regression
> as such, but it can't hurt to go into 2.6.26 IMO.
> 
> 
> 
> ------------------------------------------------------------------------
> 
> PREEMPT_RCU without HOTPLUG_CPU is broken. rcu_online_cpu is called to
> initially populate rcu_cpu_online_map with all online CPUs when the hotplug
> event handler is installed, and also to populate the map with CPUs as they
> come online. The former case is meant to happen with and without HOTPLUG_CPU,
> but without HOTPLUG_CPU, the rcu_online_cpu function is no-oped -- while it
> still gets called, it does not set the rcu CPU map.
> 
> With a blank RCU CPU map, grace periods get to tick by completely oblivious
> to active RCU read side critical sections. This results in free-before-grace
> bugs.
> 
> Fix is obvious once the problem is known. (Also, change __devinit to
> __cpuinit so the function gets thrown away on !HOTPLUG_CPU kernels).
> 
> Signed-off-by: Nick Piggin <npiggin-l3A5Bk7waGM@public.gmane.org>
> ---
> 
> Annoyed this wasn't a crazy obscure error in the algorithm I could fix :)
> I spent all day debugging it and had to make a special test case (rcutorture
> didn't seem to trigger it), and a big RCU state logging infrastructure to log
> millions of RCU state transitions and events. Oh well.
> 
> Index: linux-2.6/kernel/rcupreempt.c
> ===================================================================
> --- linux-2.6.orig/kernel/rcupreempt.c	2008-07-10 17:08:56.000000000 +1000
> +++ linux-2.6/kernel/rcupreempt.c	2008-07-10 17:09:10.000000000 +1000
> @@ -925,26 +925,22 @@ void rcu_offline_cpu(int cpu)
>  	spin_unlock_irqrestore(&rdp->lock, flags);
>  }
> 
> -void __devinit rcu_online_cpu(int cpu)
> -{
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&rcu_ctrlblk.fliplock, flags);
> -	cpu_set(cpu, rcu_cpu_online_map);
> -	spin_unlock_irqrestore(&rcu_ctrlblk.fliplock, flags);
> -}
> -
>  #else /* #ifdef CONFIG_HOTPLUG_CPU */
> 
>  void rcu_offline_cpu(int cpu)
>  {
>  }
> 
> -void __devinit rcu_online_cpu(int cpu)
> +#endif /* #else #ifdef CONFIG_HOTPLUG_CPU */
> +
> +void __cpuinit rcu_online_cpu(int cpu)
>  {
> -}
> +	unsigned long flags;
> 
> -#endif /* #else #ifdef CONFIG_HOTPLUG_CPU */
> +	spin_lock_irqsave(&rcu_ctrlblk.fliplock, flags);
> +	cpu_set(cpu, rcu_cpu_online_map);
> +	spin_unlock_irqrestore(&rcu_ctrlblk.fliplock, flags);
> +}
> 
>  static void rcu_process_callbacks(struct softirq_action *unused)
>  {


-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

