public inbox for linux-kernel@vger.kernel.org
* Re: + convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch added to -mm tree
       [not found] <200610130506.k9D56YJY031111@shell0.pdx.osdl.net>
@ 2006-10-13 15:19 ` Michal Piotrowski
  2006-10-13 18:41   ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Piotrowski @ 2006-10-13 15:19 UTC
  To: akpm; +Cc: neilb, michal.k.k.piotrowski, rusty, LKML

Hi,

akpm@osdl.org wrote:
> The patch titled
> 
>      Convert cpu hotplug notifiers to use raw_notifier instead of blocking_notifier
> 
> has been added to the -mm tree.  Its filename is
> 
>      convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch
> 
> See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
> out what to do about this
> 
> ------------------------------------------------------
> Subject: Convert cpu hotplug notifiers to use raw_notifier instead of blocking_notifier
> From: Neil Brown <neilb@suse.de>
> 
> The use of a blocking notifier by _cpu_up and _cpu_down in cpu.c has two
> problems.
> 
> 1/ An interaction with the workqueue notifier causes lockdep to spit a
>    warning.
> 
> 2/ A notifier could conceivably be added or removed while _cpu_up or
>    _cpu_down is in progress.  As each notifier is called twice (prepare
>    then commit/abort) this could be unhealthy.
> 
> To fix 2 we simply take cpu_add_remove_lock while adding or removing
> notifiers to/from the list.
> 
> This makes the 'blocking' usage unnecessary as all accesses to cpu_chain
> are now protected by cpu_add_remove_lock.  So change "blocking" to "raw" in
> all relevant places.  This fixes 1.
> 
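
The locking scheme described in the patch text above can be sketched in userspace C. This is only an illustration under stated assumptions: the names `demo_register`, `demo_cpu_up`, `cpu_chain_demo` and `add_remove_lock` are hypothetical stand-ins, a pthread mutex plays the role of cpu_add_remove_lock, and none of this is the kernel's notifier API. It shows why an unlocked ("raw") list is enough once registration and both notification phases take the same outer lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are kernel APIs. */
struct notifier {
    int (*call)(struct notifier *nb, unsigned long event);
    struct notifier *next;
};

enum { DEMO_PREPARE = 1, DEMO_COMMIT = 2 };

/* The chain itself has no lock of its own ("raw"). */
static struct notifier *cpu_chain_demo;
/* Plays the role of cpu_add_remove_lock in this sketch. */
static pthread_mutex_t add_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* Registration takes the same lock that the hotplug path holds. */
void demo_register(struct notifier *nb)
{
    pthread_mutex_lock(&add_remove_lock);
    nb->next = cpu_chain_demo;
    cpu_chain_demo = nb;
    pthread_mutex_unlock(&add_remove_lock);
}

/* Walk the chain; the caller must already hold add_remove_lock. */
static int call_chain_locked(unsigned long event)
{
    int calls = 0;

    for (struct notifier *nb = cpu_chain_demo; nb; nb = nb->next) {
        nb->call(nb, event);
        calls++;
    }
    return calls;
}

/*
 * Both phases run under a single lock hold, so no notifier can be
 * added or removed between "prepare" and "commit".
 * Returns the number of notifiers called in each phase.
 */
int demo_cpu_up(void)
{
    pthread_mutex_lock(&add_remove_lock);
    int prepared = call_chain_locked(DEMO_PREPARE);
    int committed = call_chain_locked(DEMO_COMMIT);
    pthread_mutex_unlock(&add_remove_lock);

    assert(prepared == committed);
    return prepared;
}

/* Example callback: counts every event it receives. */
static int demo_events;

int demo_count_cb(struct notifier *nb, unsigned long event)
{
    (void)nb;
    (void)event;
    demo_events++;
    return 0;
}
```

Registering two notifiers that use `demo_count_cb` and then calling `demo_cpu_up()` runs each of them once per phase. A registration attempted from another thread during that call would simply wait on `add_remove_lock`.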

There is something really wrong with this patch (or my hardware).

echo shutdown > /sys/power/disk; echo disk > /sys/power/state
works fine for me on 2.6.19-rc1-g8770c018.

On 2.6.19-rc1-mm1 +
convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch
+ Neil's avoid_lockdep_warning_in_md.patch
(http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)
I get a lot of "end_request: I/O error, dev sda, sector 31834343" messages.

I checked sda with badblocks and everything seems to be fine:
/sbin/badblocks -o /root/sda.badblocks -v /dev/sda
Checking blocks 0 to 156290904
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found.
(/root/sda.badblocks is empty)

BTW, sda is a new Seagate SATA II HDD attached to an ICH5 (SATA I) controller.

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)


* Re: + convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch added to -mm tree
  2006-10-13 15:19 ` + convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch added to -mm tree Michal Piotrowski
@ 2006-10-13 18:41   ` Andrew Morton
  2006-10-13 20:13     ` Michal Piotrowski
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-10-13 18:41 UTC
  To: Michal Piotrowski; +Cc: neilb, rusty, LKML

On Fri, 13 Oct 2006 17:19:16 +0200
Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:

> There is something really wrong with this patch (or my hardware).
> 
> echo shutdown > /sys/power/disk; echo disk > /sys/power/state
> works fine for me on 2.6.19-rc1-g8770c018.
> 
> On 2.6.19-rc1-mm1 +
> convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch
> + Neil's avoid_lockdep_warning_in_md.patch
> (http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)
> I get a lot of "end_request: I/O error, dev sda, sector 31834343" messages.

That's not exactly an expected result.  What makes you think it's due to
this patch?  Does 2.6.19-rc1-mm1 run OK?



* Re: + convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch added to -mm tree
  2006-10-13 18:41   ` Andrew Morton
@ 2006-10-13 20:13     ` Michal Piotrowski
  2006-10-13 20:43       ` Andrew Morton
  2006-10-14  6:55       ` Andrew Morton
  0 siblings, 2 replies; 6+ messages in thread
From: Michal Piotrowski @ 2006-10-13 20:13 UTC
  To: Andrew Morton; +Cc: neilb, rusty, LKML, Thomas Gleixner

On 13/10/06, Andrew Morton <akpm@osdl.org> wrote:
> On Fri, 13 Oct 2006 17:19:16 +0200
> Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
> > There is something really wrong with this patch (or my hardware).
> >
> > echo shutdown > /sys/power/disk; echo disk > /sys/power/state
> > works fine for me on 2.6.19-rc1-g8770c018.
> >
> > On 2.6.19-rc1-mm1 +
> > convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch
> > + Neil's avoid_lockdep_warning_in_md.patch
> > (http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)
> > I get a lot of "end_request: I/O error, dev sda, sector 31834343" messages.
>
> That's not exactly an expected result.  What makes you think it's due to
> this patch?  Does 2.6.19-rc1-mm1 run OK?

Yes. (The only issue is
http://www.stardust.webpages.pl/files/tbf/euridica/2.6.19-rc1-mm1/mm-dmesg)

I get many "random" bugs that prevent hibernation with this patch.
Unfortunately I can't catch backtraces (broken sysklogd, lack of
serial console).

Here is the only bug that I can reproduce and copy (by hand):

---
BUG: bad unlock balance detected!
---
klogd/1751 is trying to release lock (&cpu_base->lock_key) at:
[<c0137a3d>] hrtimer_sched_tick
but there are no more locks to release!
other info that might help us debug this:
1 lock held by klogd/1751
#0 (&cpu_base->lock_key#2){++..}, at [<c0137f2f>] hrtimer_interrupt
Stack backtrace:
c01042eb dump_stack_trace
c0104461 show_trace_log
c0104a08 show_trace
c0104a4f dump_stack
c013abdb print_unlock_inbalance_bug
c013c781 lock_release_non_nested
c013cc08 lock_release
c013575f _spin_unlock
c0137a3d hrtimer_sched_tick
c0137fc5 hrtimer_interrupt
c0113d00 smp_apic_timer_interrupt
c0103d56 apic_timer_interrupt

DWARF2 unwinder stuck at apic_timer_interrupt

Stopping tasks: ========
BUG spinlock lockup on CPU#0, aio/1/179/c742a940
c01042e6 dump_trace
c0104461 show_trace_log_lvl
c0104a08 show_trace
c0104a4f dump_stack
c0201c91 _raw__spin_lock
c03155ab _spin_lock_irqsave
c013708f lock_hrtimer_base
c013711b hrtimer_try_to_cancel
c0137182 hrtimer_cancel
c01246e2 do_exit
c0103f39 kernel_thread_helper


l *0xc0137a3d
0xc0137a3d is in hrtimer_sched_tick (/mnt/md0/devel/linux-mm/kernel/hrtimer.c:780).
775                      * update_process_times() might take tasklist_lock, hence
776                      * drop the base lock. sched-tick hrtimers are per-CPU and
777                      * never accessible by userspace APIs, so this is safe to do.
778                      */
779                     spin_unlock(&cpu_base->lock);
780                     update_process_times(user_mode(regs));
781                     profile_tick(CPU_PROFILING);
782                     spin_lock(&cpu_base->lock);
783             }
784

l *0xc0137f2f
0xc0137f2f is in hrtimer_interrupt (/mnt/md0/devel/linux-mm/kernel/hrtimer.c:1389).
1384                    ktime_t basenow;
1385                    struct rb_node *node;
1386
1387                    spin_lock(&cpu_base->lock);
1388
1389                    basenow = ktime_add(now, base->offset);
1390
1391                    while ((node = base->first)) {
1392                            struct hrtimer *timer;
1393
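
The two listings above show a lock taken in hrtimer_interrupt and then dropped and retaken around a callback in hrtimer_sched_tick. What a "bad unlock balance" check means can be sketched with a small held-lock tracker. This is a hypothetical userspace model, not lockdep's implementation: `struct held_locks`, `track_acquire`, `track_release` and `run_with_lock_dropped` are invented names, and locks are identified only by the address of a key. The thread does not establish why the real validator complained; the sketch only shows that releasing a key the tracker never recorded is reported as unbalanced.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELD 8

/* Hypothetical tracker: the key of every lock the current task holds. */
struct held_locks {
    const void *key[MAX_HELD];
    int depth;
};

/* Record an acquisition. Returns 0 on success, -1 if the table is full. */
int track_acquire(struct held_locks *h, const void *key)
{
    if (h->depth >= MAX_HELD)
        return -1;
    h->key[h->depth++] = key;
    return 0;
}

/*
 * Record a release. Returns 0 if the key was held, or -1 for an
 * unbalanced unlock (the key is not in the held list).
 */
int track_release(struct held_locks *h, const void *key)
{
    for (int i = h->depth - 1; i >= 0; i--) {
        if (h->key[i] == key) {
            /* Close the gap; releases need not be in LIFO order. */
            for (int j = i; j < h->depth - 1; j++)
                h->key[j] = h->key[j + 1];
            h->depth--;
            return 0;
        }
    }
    return -1;
}

/*
 * Drop-and-retake pattern from the first listing: release the lock,
 * run a callback that may take other locks, then take it again.
 * Returns -1 without calling fn if the release is unbalanced.
 */
int run_with_lock_dropped(struct held_locks *h, const void *key,
                          void (*fn)(void))
{
    if (track_release(h, key) != 0)
        return -1;
    fn();
    return track_acquire(h, key);
}

/* Placeholder for the work done while the lock is dropped. */
void demo_noop(void)
{
}
```

In this model the pattern is fine as long as the key passed to `run_with_lock_dropped` is the one that was recorded on acquisition. Passing a different key fails at the release step and leaves the held list untouched.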

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)


* Re: + convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch added to -mm tree
  2006-10-13 20:13     ` Michal Piotrowski
@ 2006-10-13 20:43       ` Andrew Morton
  2006-10-14  6:55       ` Andrew Morton
  1 sibling, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2006-10-13 20:43 UTC
  To: Michal Piotrowski; +Cc: neilb, rusty, LKML, Thomas Gleixner

On Fri, 13 Oct 2006 22:13:43 +0200
"Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> wrote:

> On 13/10/06, Andrew Morton <akpm@osdl.org> wrote:
> > On Fri, 13 Oct 2006 17:19:16 +0200
> > Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> >
> > > There is something really wrong with this patch (or my hardware).
> > >
> > > echo shutdown > /sys/power/disk; echo disk > /sys/power/state
> > > works fine for me on 2.6.19-rc1-g8770c018.
> > >
> > > On 2.6.19-rc1-mm1 +
> > > convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch
> > > + Neil's avoid_lockdep_warning_in_md.patch
> > > (http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)
> > > I get a lot of "end_request: I/O error, dev sda, sector 31834343" messages.
> >
> > That's not exactly an expected result.  What makes you think it's due to
> > this patch?  Does 2.6.19-rc1-mm1 run OK?
> 
> Yes. (The only issue is
> http://www.stardust.webpages.pl/files/tbf/euridica/2.6.19-rc1-mm1/mm-dmesg)
> 
> I get many "random" bugs that prevent hibernation with this patch.
> Unfortunately I can't catch backtraces (broken sysklogd, lack of
> serial console).
> 
> Here is the only bug that I can reproduce and copy (by hand):
> 
> ---
> BUG: bad unlock balance detected!
> ---
> klogd/1751 is trying to release lock (&cpu_base->lock_key) at:
> [<c0137a3d>] hrtimer_sched_tick
> but there are no more locks to release!
> other info that might help us debug this:
> 1 lock held by klogd/1751

<reviews the patch again>

How completely bizarre.  I've no idea what's going on.  I'll see if I can
reproduce it later today (fat chance).

Meanwhile, let's blame Rusty.


* Re: + convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch added to -mm tree
  2006-10-13 20:13     ` Michal Piotrowski
  2006-10-13 20:43       ` Andrew Morton
@ 2006-10-14  6:55       ` Andrew Morton
  2006-10-14 11:53         ` Michal Piotrowski
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-10-14  6:55 UTC
  To: Michal Piotrowski; +Cc: neilb, rusty, LKML, Thomas Gleixner

On Fri, 13 Oct 2006 22:13:43 +0200
"Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> wrote:

> On 13/10/06, Andrew Morton <akpm@osdl.org> wrote:
> > On Fri, 13 Oct 2006 17:19:16 +0200
> > Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> >
> > > There is something really wrong with this patch (or my hardware).
> > >
> > > echo shutdown > /sys/power/disk; echo disk > /sys/power/state
> > > works fine for me on 2.6.19-rc1-g8770c018.
> > >
> > > On 2.6.19-rc1-mm1 +
> > > convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch
> > > + Neil's avoid_lockdep_warning_in_md.patch
> > > (http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)
> > > I get a lot of "end_request: I/O error, dev sda, sector 31834343" messages.
> >
> > That's not exactly an expected result.  What makes you think it's due to
> > this patch?  Does 2.6.19-rc1-mm1 run OK?
> 
> Yes. (The only issue is
> http://www.stardust.webpages.pl/files/tbf/euridica/2.6.19-rc1-mm1/mm-dmesg)
> 
> I get many "random" bugs that prevent hibernation with this patch.

As predicted, it works for me.  In fact it makes a string of nasty SMP-only
warnings go away.


* Re: + convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch added to -mm tree
  2006-10-14  6:55       ` Andrew Morton
@ 2006-10-14 11:53         ` Michal Piotrowski
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Piotrowski @ 2006-10-14 11:53 UTC
  To: Andrew Morton; +Cc: Michal Piotrowski, neilb, rusty, LKML, Thomas Gleixner

Andrew Morton wrote:
> On Fri, 13 Oct 2006 22:13:43 +0200
> "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> wrote:
> 
>> On 13/10/06, Andrew Morton <akpm@osdl.org> wrote:
>>> On Fri, 13 Oct 2006 17:19:16 +0200
>>> Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>>>
>>>> There is something really wrong with this patch (or my hardware).
>>>>
>>>> echo shutdown > /sys/power/disk; echo disk > /sys/power/state
>>>> works fine for me on 2.6.19-rc1-g8770c018.
>>>>
>>>> On 2.6.19-rc1-mm1 +
>>>> convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch
>>>> + Neil's avoid_lockdep_warning_in_md.patch
>>>> (http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)
>>>> I get a lot of "end_request: I/O error, dev sda, sector 31834343" messages.
>>> That's not exactly an expected result.  What makes you think it's due to
>>> this patch?  Does 2.6.19-rc1-mm1 run OK?
>> Yes. (The only issue is
>> http://www.stardust.webpages.pl/files/tbf/euridica/2.6.19-rc1-mm1/mm-dmesg)
>>
>> I get many "random" bugs that prevent hibernation with this patch.
> 
> As predicted, it works for me.  In fact it makes a string of nasty SMP-only
> warnings go away.
> 

I reverted Neil's avoid_lockdep_warning_in_md.patch and everything works fine.

2.6.19-rc1-mm1 + avoid_lockdep_warning_in_md.patch works well for me.
2.6.19-rc1-mm1 + convert-cpu-hotplug-notifiers-to-use-raw_notifier-instead-of-blocking_notifier.patch also works well.

2.6.19-rc1-mm1 + both patches = crashing hibernation.

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)

-----------------------
Avoid lockdep warning in md.

md_open takes ->reconfig_mutex which causes lockdep to complain.
This (normally) doesn't have deadlock potential as the possible
conflict is with a reconfig_mutex in a different device.

I say "normally" because if a loop were created in the array->member
hierarchy a deadlock could happen.  However that causes bigger
problems than a deadlock and should be fixed independently.

So we flag the lock in md_open as a nested lock.  This requires
defining mutex_lock_interruptible_nested.
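
How a subclass annotation quiets a same-class warning can be sketched with a toy validator. This is a simplified model under stated assumptions, not lockdep itself: `struct task_locks`, `check_acquire` and `RECONFIG_MUTEX_CLASS` are hypothetical names, and the only rule modelled is "warn when a task takes a lock whose class and subclass match one it already holds".

```c
#include <assert.h>

#define MAX_HELD 8

/* Hypothetical class id standing in for the reconfig_mutex lock class. */
enum { RECONFIG_MUTEX_CLASS = 1 };

struct held {
    int class_id;
    unsigned int subclass;
};

struct task_locks {
    struct held held[MAX_HELD];
    int depth;
};

/*
 * Toy validator. Returns 0 if the acquisition is accepted, or -1 if
 * it would warn (same class and same subclass already held) or if
 * the table is full.
 */
int check_acquire(struct task_locks *t, int class_id, unsigned int subclass)
{
    for (int i = 0; i < t->depth; i++) {
        if (t->held[i].class_id == class_id &&
            t->held[i].subclass == subclass)
            return -1;
    }
    if (t->depth >= MAX_HELD)
        return -1;

    t->held[t->depth].class_id = class_id;
    t->held[t->depth].subclass = subclass;
    t->depth++;
    return 0;
}
```

In this model, taking a second lock of the same class with subclass 0 is rejected, while the same acquisition with subclass 1 is accepted. The annotation only tells the checker that the nesting is intended. It does not make a real loop in the array->member hierarchy safe, which is the caveat the patch description raises.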

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c       |    2 +-
 ./include/linux/mutex.h |    3 ++-
 ./kernel/mutex.c        |    8 ++++++++
 3 files changed, 11 insertions(+), 2 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2006-10-09 14:25:11.000000000 +1000
+++ ./drivers/md/md.c	2006-10-10 12:28:35.000000000 +1000
@@ -4422,7 +4422,7 @@ static int md_open(struct inode *inode,
 	mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
 	int err;

-	if ((err = mddev_lock(mddev)))
+	if ((err = mutex_lock_interruptible_nested(&mddev->reconfig_mutex, 1)))
 		goto out;

 	err = 0;

diff .prev/include/linux/mutex.h ./include/linux/mutex.h
--- .prev/include/linux/mutex.h	2006-10-10 12:37:04.000000000 +1000
+++ ./include/linux/mutex.h	2006-10-10 12:40:20.000000000 +1000
@@ -125,8 +125,9 @@ extern int fastcall mutex_lock_interrupt

 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
+extern int mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass);
 #else
-# define mutex_lock_nested(lock, subclass) mutex_lock(lock)
+# define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
 #endif

 /*

diff .prev/kernel/mutex.c ./kernel/mutex.c
--- .prev/kernel/mutex.c	2006-10-10 12:35:54.000000000 +1000
+++ ./kernel/mutex.c	2006-10-10 13:20:04.000000000 +1000
@@ -206,6 +206,14 @@ mutex_lock_nested(struct mutex *lock, un
 }

 EXPORT_SYMBOL_GPL(mutex_lock_nested);
+int __sched
+mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
+{
+	might_sleep();
+	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, subclass);
+}
+
+EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
 #endif

 /*

