public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [BUG] NULL pointer deref with rcutorture
@ 2009-01-02 11:18 Eric Sesterhenn
  2009-01-02 17:58 ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-02 11:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: josh, dipankar

Hi,

modprobing rcutorture and removing it again gives me the following oops:

[  122.873859] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
[  162.987060] BUG: unable to handle kernel NULL pointer dereference at 00000000
[  162.987194] IP: [<00000000>] 0x0
[  162.987268] *pde = 00000000 
[  162.987339] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[  162.987499] last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/resource
[  162.987534] Modules linked in: [last unloaded: rcutorture]
[  162.987667] 
[  162.987699] Pid: 5091, comm: udevd Tainted: G        W  (2.6.28-04980-gb58602a-dirty #154) System Name
[  162.987732] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
[  162.987761] EIP is at 0x0
[  162.987849] EAX: d0afcbf0 EBX: 00000000 ECX: c0160f5a EDX: 00000006
[  162.987880] ESI: d0afcbf0 EDI: 0000001c EBP: c0b01fe0 ESP: c0b01fd4
[  162.987910]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[  162.987941] Process udevd (pid: 5091, ti=c0b01000 task=c8cc4100 task.ti=c8d20000)
[  162.988027] Stack:
[  162.988027]  c0160f61 00000200 00000001 c0b01ff8 c012a723 0000000a c8d20e4c 00000046
[  162.988027]  c012a6b4 c8d20e5c c01042ce
[  162.988027] Call Trace:
[  162.988027]  [<c0160f61>] ? rcu_process_callbacks+0x65/0x79
[  162.988027]  [<c012a723>] ? __do_softirq+0x6f/0xf6
[  162.988027]  [<c012a6b4>] ? __do_softirq+0x0/0xf6
[  162.988027]  <IRQ> <0> [<c012a631>] ? irq_exit+0x40/0x7c
[  162.988027]  [<c0110b91>] ? smp_apic_timer_interrupt+0x68/0x73
[  162.988027]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
[  162.988027]  [<c017af71>] ? unmap_vmas+0x308/0x47f
[  162.988027]  [<c017afa2>] ? unmap_vmas+0x339/0x47f
[  162.988027]  [<c0143db5>] ? trace_hardirqs_on_caller+0x17/0x15a
[  162.988027]  [<c017e326>] ? exit_mmap+0x83/0x104
[  162.988027]  [<c01244b8>] ? mmput+0x39/0x89
[  162.988027]  [<c01278de>] ? exit_mm+0xc3/0xcb
[  162.988027]  [<c0128b99>] ? do_exit+0x199/0x6d5
[  162.988027]  [<c04e56b0>] ? trace_hardirqs_on_thunk+0xc/0x10
[  162.988027]  [<c0129137>] ? do_group_exit+0x62/0x88
[  162.988027]  [<c0129171>] ? sys_exit_group+0x14/0x16
[  162.988027]  [<c0102ea1>] ? sysenter_do_call+0x12/0x31
[  162.988027] Code:  Bad EIP value.
[  162.988027] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b01fd4
[  162.990524] ---[ end trace 4eaa2a86a8e2da22 ]---
[  162.990555] Kernel panic - not syncing: Fatal exception in interrupt


Kernel is current -git with a unrelated
networking patch (http://marc.info/?l=linux-netdev&m=123088641023335&w=2)

root@whiterabbit:/usr/src/linux# grep RCU .config
# CONFIG_CLASSIC_RCU is not set
# CONFIG_TREE_RCU is not set
CONFIG_PREEMPT_RCU=y
CONFIG_RCU_TRACE=y
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_PREEMPT_RCU_TRACE=y
CONFIG_RCU_TORTURE_TEST=m

Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-02 11:18 [BUG] NULL pointer deref with rcutorture Eric Sesterhenn
@ 2009-01-02 17:58 ` Paul E. McKenney
  2009-01-02 18:53   ` Kamalesh Babulal
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-02 17:58 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: linux-kernel, josh, dipankar, kamalesh

On Fri, Jan 02, 2009 at 12:18:19PM +0100, Eric Sesterhenn wrote:
> Hi,
> 
> modprobing rcutorture and removing it again gives me the following oops:

This looks similar to something that Kamalesh ran into a few weeks ago.

Kamalesh, did the fixes for that make it to 2.6.28?

							Thanx, Paul

> [  122.873859] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
> [  162.987060] BUG: unable to handle kernel NULL pointer dereference at 00000000
> [  162.987194] IP: [<00000000>] 0x0
> [  162.987268] *pde = 00000000 
> [  162.987339] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
> [  162.987499] last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/resource
> [  162.987534] Modules linked in: [last unloaded: rcutorture]
> [  162.987667] 
> [  162.987699] Pid: 5091, comm: udevd Tainted: G        W  (2.6.28-04980-gb58602a-dirty #154) System Name
> [  162.987732] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
> [  162.987761] EIP is at 0x0
> [  162.987849] EAX: d0afcbf0 EBX: 00000000 ECX: c0160f5a EDX: 00000006
> [  162.987880] ESI: d0afcbf0 EDI: 0000001c EBP: c0b01fe0 ESP: c0b01fd4
> [  162.987910]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  162.987941] Process udevd (pid: 5091, ti=c0b01000 task=c8cc4100 task.ti=c8d20000)
> [  162.988027] Stack:
> [  162.988027]  c0160f61 00000200 00000001 c0b01ff8 c012a723 0000000a c8d20e4c 00000046
> [  162.988027]  c012a6b4 c8d20e5c c01042ce
> [  162.988027] Call Trace:
> [  162.988027]  [<c0160f61>] ? rcu_process_callbacks+0x65/0x79
> [  162.988027]  [<c012a723>] ? __do_softirq+0x6f/0xf6
> [  162.988027]  [<c012a6b4>] ? __do_softirq+0x0/0xf6
> [  162.988027]  <IRQ> <0> [<c012a631>] ? irq_exit+0x40/0x7c
> [  162.988027]  [<c0110b91>] ? smp_apic_timer_interrupt+0x68/0x73
> [  162.988027]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
> [  162.988027]  [<c017af71>] ? unmap_vmas+0x308/0x47f
> [  162.988027]  [<c017afa2>] ? unmap_vmas+0x339/0x47f
> [  162.988027]  [<c0143db5>] ? trace_hardirqs_on_caller+0x17/0x15a
> [  162.988027]  [<c017e326>] ? exit_mmap+0x83/0x104
> [  162.988027]  [<c01244b8>] ? mmput+0x39/0x89
> [  162.988027]  [<c01278de>] ? exit_mm+0xc3/0xcb
> [  162.988027]  [<c0128b99>] ? do_exit+0x199/0x6d5
> [  162.988027]  [<c04e56b0>] ? trace_hardirqs_on_thunk+0xc/0x10
> [  162.988027]  [<c0129137>] ? do_group_exit+0x62/0x88
> [  162.988027]  [<c0129171>] ? sys_exit_group+0x14/0x16
> [  162.988027]  [<c0102ea1>] ? sysenter_do_call+0x12/0x31
> [  162.988027] Code:  Bad EIP value.
> [  162.988027] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b01fd4
> [  162.990524] ---[ end trace 4eaa2a86a8e2da22 ]---
> [  162.990555] Kernel panic - not syncing: Fatal exception in interrupt
> 
> 
> Kernel is current -git with a unrelated
> networking patch (http://marc.info/?l=linux-netdev&m=123088641023335&w=2)
> 
> root@whiterabbit:/usr/src/linux# grep RCU .config
> # CONFIG_CLASSIC_RCU is not set
> # CONFIG_TREE_RCU is not set
> CONFIG_PREEMPT_RCU=y
> CONFIG_RCU_TRACE=y
> # CONFIG_TREE_RCU_TRACE is not set
> CONFIG_PREEMPT_RCU_TRACE=y
> CONFIG_RCU_TORTURE_TEST=m
> 
> Greetings, Eric
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-02 17:58 ` Paul E. McKenney
@ 2009-01-02 18:53   ` Kamalesh Babulal
  2009-01-02 19:53     ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Kamalesh Babulal @ 2009-01-02 18:53 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Eric Sesterhenn, linux-kernel, josh, dipankar

* Paul E. McKenney <paulmck@linux.vnet.ibm.com> [2009-01-02 09:58:53]:

> On Fri, Jan 02, 2009 at 12:18:19PM +0100, Eric Sesterhenn wrote:
> > Hi,
> > 
> > modprobing rcutorture and removing it again gives me the following oops:
> 
> This looks similar to something that Kamalesh ran into a few weeks ago.
> 
> Kamalesh, did the fixes for that make it to 2.6.28?
> 
> 							Thanx, Paul
> 

Hi Paul,

	The fix was available in linux-next from Dec 23 2008 (commit 392ddc32982a5c661dd90dd49a3cb37f1c68b782
'bio: add support for inlining a number of bio_vecs inside the bio') and it has be pushed for merger in 2.6.29.

> > [  122.873859] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
> > [  162.987060] BUG: unable to handle kernel NULL pointer dereference at 00000000
> > [  162.987194] IP: [<00000000>] 0x0
> > [  162.987268] *pde = 00000000 
> > [  162.987339] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
> > [  162.987499] last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/resource
> > [  162.987534] Modules linked in: [last unloaded: rcutorture]
> > [  162.987667] 
> > [  162.987699] Pid: 5091, comm: udevd Tainted: G        W  (2.6.28-04980-gb58602a-dirty #154) System Name
> > [  162.987732] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
> > [  162.987761] EIP is at 0x0
> > [  162.987849] EAX: d0afcbf0 EBX: 00000000 ECX: c0160f5a EDX: 00000006
> > [  162.987880] ESI: d0afcbf0 EDI: 0000001c EBP: c0b01fe0 ESP: c0b01fd4
> > [  162.987910]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > [  162.987941] Process udevd (pid: 5091, ti=c0b01000 task=c8cc4100 task.ti=c8d20000)
> > [  162.988027] Stack:
> > [  162.988027]  c0160f61 00000200 00000001 c0b01ff8 c012a723 0000000a c8d20e4c 00000046
> > [  162.988027]  c012a6b4 c8d20e5c c01042ce
> > [  162.988027] Call Trace:
> > [  162.988027]  [<c0160f61>] ? rcu_process_callbacks+0x65/0x79
> > [  162.988027]  [<c012a723>] ? __do_softirq+0x6f/0xf6
> > [  162.988027]  [<c012a6b4>] ? __do_softirq+0x0/0xf6
> > [  162.988027]  <IRQ> <0> [<c012a631>] ? irq_exit+0x40/0x7c
> > [  162.988027]  [<c0110b91>] ? smp_apic_timer_interrupt+0x68/0x73
> > [  162.988027]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
> > [  162.988027]  [<c017af71>] ? unmap_vmas+0x308/0x47f
> > [  162.988027]  [<c017afa2>] ? unmap_vmas+0x339/0x47f
> > [  162.988027]  [<c0143db5>] ? trace_hardirqs_on_caller+0x17/0x15a
> > [  162.988027]  [<c017e326>] ? exit_mmap+0x83/0x104
> > [  162.988027]  [<c01244b8>] ? mmput+0x39/0x89
> > [  162.988027]  [<c01278de>] ? exit_mm+0xc3/0xcb
> > [  162.988027]  [<c0128b99>] ? do_exit+0x199/0x6d5
> > [  162.988027]  [<c04e56b0>] ? trace_hardirqs_on_thunk+0xc/0x10
> > [  162.988027]  [<c0129137>] ? do_group_exit+0x62/0x88
> > [  162.988027]  [<c0129171>] ? sys_exit_group+0x14/0x16
> > [  162.988027]  [<c0102ea1>] ? sysenter_do_call+0x12/0x31
> > [  162.988027] Code:  Bad EIP value.
> > [  162.988027] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b01fd4
> > [  162.990524] ---[ end trace 4eaa2a86a8e2da22 ]---
> > [  162.990555] Kernel panic - not syncing: Fatal exception in interrupt
> > 
> > 
> > Kernel is current -git with a unrelated
> > networking patch (http://marc.info/?l=linux-netdev&m=123088641023335&w=2)
> > 
> > root@whiterabbit:/usr/src/linux# grep RCU .config
> > # CONFIG_CLASSIC_RCU is not set
> > # CONFIG_TREE_RCU is not set
> > CONFIG_PREEMPT_RCU=y
> > CONFIG_RCU_TRACE=y
> > # CONFIG_TREE_RCU_TRACE is not set
> > CONFIG_PREEMPT_RCU_TRACE=y
> > CONFIG_RCU_TORTURE_TEST=m
> > 
> > Greetings, Eric
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-02 18:53   ` Kamalesh Babulal
@ 2009-01-02 19:53     ` Paul E. McKenney
  2009-01-02 23:12       ` Eric Sesterhenn
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-02 19:53 UTC (permalink / raw)
  To: Kamalesh Babulal; +Cc: Eric Sesterhenn, linux-kernel, josh, dipankar

On Sat, Jan 03, 2009 at 12:23:03AM +0530, Kamalesh Babulal wrote:
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> [2009-01-02 09:58:53]:
> 
> > On Fri, Jan 02, 2009 at 12:18:19PM +0100, Eric Sesterhenn wrote:
> > > Hi,
> > > 
> > > modprobing rcutorture and removing it again gives me the following oops:
> > 
> > This looks similar to something that Kamalesh ran into a few weeks ago.
> > 
> > Kamalesh, did the fixes for that make it to 2.6.28?
> > 
> > 							Thanx, Paul
> > 
> 
> Hi Paul,
> 
> 	The fix was available in linux-next from Dec 23 2008 (commit 392ddc32982a5c661dd90dd49a3cb37f1c68b782
> 'bio: add support for inlining a number of bio_vecs inside the bio') and it has be pushed for merger in 2.6.29.

Thank you, Kamalesh!

Eric, could you please try applying the above commit and see if the
problem reproduces?

							Thanx, Paul

> > > [  122.873859] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
> > > [  162.987060] BUG: unable to handle kernel NULL pointer dereference at 00000000
> > > [  162.987194] IP: [<00000000>] 0x0
> > > [  162.987268] *pde = 00000000 
> > > [  162.987339] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
> > > [  162.987499] last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/resource
> > > [  162.987534] Modules linked in: [last unloaded: rcutorture]
> > > [  162.987667] 
> > > [  162.987699] Pid: 5091, comm: udevd Tainted: G        W  (2.6.28-04980-gb58602a-dirty #154) System Name
> > > [  162.987732] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
> > > [  162.987761] EIP is at 0x0
> > > [  162.987849] EAX: d0afcbf0 EBX: 00000000 ECX: c0160f5a EDX: 00000006
> > > [  162.987880] ESI: d0afcbf0 EDI: 0000001c EBP: c0b01fe0 ESP: c0b01fd4
> > > [  162.987910]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > > [  162.987941] Process udevd (pid: 5091, ti=c0b01000 task=c8cc4100 task.ti=c8d20000)
> > > [  162.988027] Stack:
> > > [  162.988027]  c0160f61 00000200 00000001 c0b01ff8 c012a723 0000000a c8d20e4c 00000046
> > > [  162.988027]  c012a6b4 c8d20e5c c01042ce
> > > [  162.988027] Call Trace:
> > > [  162.988027]  [<c0160f61>] ? rcu_process_callbacks+0x65/0x79
> > > [  162.988027]  [<c012a723>] ? __do_softirq+0x6f/0xf6
> > > [  162.988027]  [<c012a6b4>] ? __do_softirq+0x0/0xf6
> > > [  162.988027]  <IRQ> <0> [<c012a631>] ? irq_exit+0x40/0x7c
> > > [  162.988027]  [<c0110b91>] ? smp_apic_timer_interrupt+0x68/0x73
> > > [  162.988027]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
> > > [  162.988027]  [<c017af71>] ? unmap_vmas+0x308/0x47f
> > > [  162.988027]  [<c017afa2>] ? unmap_vmas+0x339/0x47f
> > > [  162.988027]  [<c0143db5>] ? trace_hardirqs_on_caller+0x17/0x15a
> > > [  162.988027]  [<c017e326>] ? exit_mmap+0x83/0x104
> > > [  162.988027]  [<c01244b8>] ? mmput+0x39/0x89
> > > [  162.988027]  [<c01278de>] ? exit_mm+0xc3/0xcb
> > > [  162.988027]  [<c0128b99>] ? do_exit+0x199/0x6d5
> > > [  162.988027]  [<c04e56b0>] ? trace_hardirqs_on_thunk+0xc/0x10
> > > [  162.988027]  [<c0129137>] ? do_group_exit+0x62/0x88
> > > [  162.988027]  [<c0129171>] ? sys_exit_group+0x14/0x16
> > > [  162.988027]  [<c0102ea1>] ? sysenter_do_call+0x12/0x31
> > > [  162.988027] Code:  Bad EIP value.
> > > [  162.988027] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b01fd4
> > > [  162.990524] ---[ end trace 4eaa2a86a8e2da22 ]---
> > > [  162.990555] Kernel panic - not syncing: Fatal exception in interrupt
> > > 
> > > 
> > > Kernel is current -git with a unrelated
> > > networking patch (http://marc.info/?l=linux-netdev&m=123088641023335&w=2)
> > > 
> > > root@whiterabbit:/usr/src/linux# grep RCU .config
> > > # CONFIG_CLASSIC_RCU is not set
> > > # CONFIG_TREE_RCU is not set
> > > CONFIG_PREEMPT_RCU=y
> > > CONFIG_RCU_TRACE=y
> > > # CONFIG_TREE_RCU_TRACE is not set
> > > CONFIG_PREEMPT_RCU_TRACE=y
> > > CONFIG_RCU_TORTURE_TEST=m
> > > 
> > > Greetings, Eric
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > Please read the FAQ at  http://www.tux.org/lkml/
> 
> -- 
> Thanks & Regards,
> Kamalesh Babulal,
> Linux Technology Center,
> IBM, ISTL.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-02 19:53     ` Paul E. McKenney
@ 2009-01-02 23:12       ` Eric Sesterhenn
  2009-01-03  1:57         ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-02 23:12 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Sat, Jan 03, 2009 at 12:23:03AM +0530, Kamalesh Babulal wrote:
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> [2009-01-02 09:58:53]:
> > 
> > > On Fri, Jan 02, 2009 at 12:18:19PM +0100, Eric Sesterhenn wrote:
> > > > Hi,
> > > > 
> > > > modprobing rcutorture and removing it again gives me the following oops:
> > > 
> > > This looks similar to something that Kamalesh ran into a few weeks ago.
> > > 
> > > Kamalesh, did the fixes for that make it to 2.6.28?
> > > 
> > > 							Thanx, Paul
> > > 
> > 
> > Hi Paul,
> > 
> > 	The fix was available in linux-next from Dec 23 2008 (commit 392ddc32982a5c661dd90dd49a3cb37f1c68b782
> > 'bio: add support for inlining a number of bio_vecs inside the bio') and it has be pushed for merger in 2.6.29.
> 
> Thank you, Kamalesh!
> 
> Eric, could you please try applying the above commit and see if the
> problem reproduces?

I tried to apply the patch from
http://git.kernel.dk/?p=linux-2.6-block.git;a=commitdiff_plain;h=392ddc32982a5c661dd90dd49a3cb37f1c68b782;hp=bb799ca0202a360fa74d5f17039b9100caebdde7
but get a message that it is already applied. Seems i already got that
one.

Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-02 23:12       ` Eric Sesterhenn
@ 2009-01-03  1:57         ` Paul E. McKenney
       [not found]           ` <20090103094003.GA6149@alice>
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-03  1:57 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Sat, Jan 03, 2009 at 12:12:39AM +0100, Eric Sesterhenn wrote:
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Sat, Jan 03, 2009 at 12:23:03AM +0530, Kamalesh Babulal wrote:
> > > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> [2009-01-02 09:58:53]:
> > > 
> > > > On Fri, Jan 02, 2009 at 12:18:19PM +0100, Eric Sesterhenn wrote:
> > > > > Hi,
> > > > > 
> > > > > modprobing rcutorture and removing it again gives me the following oops:
> > > > 
> > > > This looks similar to something that Kamalesh ran into a few weeks ago.
> > > > 
> > > > Kamalesh, did the fixes for that make it to 2.6.28?
> > > > 
> > > > 							Thanx, Paul
> > > > 
> > > 
> > > Hi Paul,
> > > 
> > > 	The fix was available in linux-next from Dec 23 2008 (commit 392ddc32982a5c661dd90dd49a3cb37f1c68b782
> > > 'bio: add support for inlining a number of bio_vecs inside the bio') and it has be pushed for merger in 2.6.29.
> > 
> > Thank you, Kamalesh!
> > 
> > Eric, could you please try applying the above commit and see if the
> > problem reproduces?
> 
> I tried to apply the patch from
> http://git.kernel.dk/?p=linux-2.6-block.git;a=commitdiff_plain;h=392ddc32982a5c661dd90dd49a3cb37f1c68b782;hp=bb799ca0202a360fa74d5f17039b9100caebdde7
> but get a message that it is already applied. Seems i already got that
> one.

Is your configuration able to support ftrace?

Looks like someone is (wrongly) freeing some memory that was sent to
call_rcu(), and ftrace might be one way to locate the problem.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
       [not found]             ` <20090104013254.GG6958@linux.vnet.ibm.com>
@ 2009-01-04 14:57               ` Eric Sesterhenn
  2009-01-04 21:13                 ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-04 14:57 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Sat, Jan 03, 2009 at 10:40:03AM +0100, Eric Sesterhenn wrote:
> > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > On Sat, Jan 03, 2009 at 12:12:39AM +0100, Eric Sesterhenn wrote:
> > > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > 
> > > > I tried to apply the patch from
> > > > http://git.kernel.dk/?p=linux-2.6-block.git;a=commitdiff_plain;h=392ddc32982a5c661dd90dd49a3cb37f1c68b782;hp=bb799ca0202a360fa74d5f17039b9100caebdde7
> > > > but get a message that it is already applied. Seems i already got that
> > > > one.
> > > 
> > > Is your configuration able to support ftrace?
> > 
> > yes
> > 
> > > Looks like someone is (wrongly) freeing some memory that was sent to
> > > call_rcu(), and ftrace might be one way to locate the problem.
> > 
> > I set the filter to call_rcu() and call_rcu_sched(), reproduced the
> > oops, and got this in the trace file.
> 
> Hmmm...  The only unique ones are _rcu_barrier() used during module
> removal by rcutorture and rt_worker_func(), which occurs near the
> beginning of the trace.  My next step would be to trace the addresses of
> rcu_head structures passed to call_rcu() and friends and to also trace
> the addresses of the structures as they are invoked (where the original
> NULL-pointer exception occurred).
> 
> Is this tracing something you would be interested in taking on?  My
> test setup is having some trouble, so it would probably be a couple of
> days before I got it working.  :-/

Just tell me what i need to do, I am not really familiar with ftrace.
I am only able to test 2.6.28-04980-gb58602a, since current -git is not
able to boot on this box :|

Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-04 14:57               ` Eric Sesterhenn
@ 2009-01-04 21:13                 ` Paul E. McKenney
  2009-01-04 23:38                   ` Eric Sesterhenn
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-04 21:13 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Sun, Jan 04, 2009 at 03:57:26PM +0100, Eric Sesterhenn wrote:
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Sat, Jan 03, 2009 at 10:40:03AM +0100, Eric Sesterhenn wrote:
> > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > On Sat, Jan 03, 2009 at 12:12:39AM +0100, Eric Sesterhenn wrote:
> > > > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > > 
> > > > > I tried to apply the patch from
> > > > > http://git.kernel.dk/?p=linux-2.6-block.git;a=commitdiff_plain;h=392ddc32982a5c661dd90dd49a3cb37f1c68b782;hp=bb799ca0202a360fa74d5f17039b9100caebdde7
> > > > > but get a message that it is already applied. Seems i already got that
> > > > > one.
> > > > 
> > > > Is your configuration able to support ftrace?
> > > 
> > > yes
> > > 
> > > > Looks like someone is (wrongly) freeing some memory that was sent to
> > > > call_rcu(), and ftrace might be one way to locate the problem.
> > > 
> > > I set the filter to call_rcu() and call_rcu_sched(), reproduced the
> > > oops, and got this in the trace file.
> > 
> > Hmmm...  The only unique ones are _rcu_barrier() used during module
> > removal by rcutorture and rt_worker_func(), which occurs near the
> > beginning of the trace.  My next step would be to trace the addresses of
> > rcu_head structures passed to call_rcu() and friends and to also trace
> > the addresses of the structures as they are invoked (where the original
> > NULL-pointer exception occurred).
> > 
> > Is this tracing something you would be interested in taking on?  My
> > test setup is having some trouble, so it would probably be a couple of
> > days before I got it working.  :-/
> 
> Just tell me what i need to do, I am not really familiar with ftrace.
> I am only able to test 2.6.28-04980-gb58602a, since current -git is not
> able to boot on this box :|

Very cool!

The idea is to have __call_rcu() in kernel/rcutree.c record the
address of the callback (argument "head") and the function (argument
"func").  In rcu_do_batch(), just before invoking list->func(list),
also record the address of the callback ("list") and the function
(again, "func").

The new ftrace package has some mechanisms for doing this, but there is
always the old-fashioned way of using printk(), for example in
rcu_do_batch():

	prefetch(next);
	if (rcu_dump_callbacks)
		printk("rcu_head=%p, func=%p\n", list, func);
	list->func(list);

Initialize rcu_dump_callbacks to zero, then use a small kernel module
(or some such) to set it to one just before running your test.

If you would like, I could rough out the code, though as noted earlier,
I currently have no way of testing it.  (Hopefully will be fixed in a
few days.)

I was wondering what should be done for ftrace plugins for RCU, I guess
I now have at least part of the answer.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-04 21:13                 ` Paul E. McKenney
@ 2009-01-04 23:38                   ` Eric Sesterhenn
  2009-01-05  2:28                     ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-04 23:38 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

hi,

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Sun, Jan 04, 2009 at 03:57:26PM +0100, Eric Sesterhenn wrote:
> > 
> > Just tell me what i need to do, I am not really familiar with ftrace.
> > I am only able to test 2.6.28-04980-gb58602a, since current -git is not
> > able to boot on this box :|
> 
> Very cool!
> 
> The idea is to have __call_rcu() in kernel/rcutree.c record the
> address of the callback (argument "head") and the function (argument
> "func").  In rcu_do_batch(), just before invoking list->func(list),
> also record the address of the callback ("list") and the function
> (again, "func").
> 
> The new ftrace package has some mechanisms for doing this, but there is
> always the old-fashioned way of using printk(), for example in
> rcu_do_batch():
> 
> 	prefetch(next);
> 	if (rcu_dump_callbacks)
> 		printk("rcu_head=%p, func=%p\n", list, func);
> 	list->func(list);
> 
> Initialize rcu_dump_callbacks to zero, then use a small kernel module
> (or some such) to set it to one just before running your test.

i did it via the ugly printk and captured it via netconsole

for the box with rcutree you can find the log here:
http://www.cccmz.de/~snakebyte/200.log
The interesting part seems start at 155.858923, thats when I load
the rcutorture module

with http://www.cccmz.de/~snakebyte/200_1.log i had another try,
actually saw a rcu_do_batch: rcu_head=d1907720, func=(null) in the traces,
box went into unresponsive mode afterwards

And just in case it helps somehow: http://www.cccmz.de/~snakebyte/200_3.log

for the one with rcupreemt you can find the log here:
http://www.cccmz.de/~snakebyte/201.log
rcutorture is loaded at 87.312399

Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-04 23:38                   ` Eric Sesterhenn
@ 2009-01-05  2:28                     ` Paul E. McKenney
  2009-01-05 12:14                       ` Eric Sesterhenn
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-05  2:28 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Mon, Jan 05, 2009 at 12:38:55AM +0100, Eric Sesterhenn wrote:
> hi,
> 
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Sun, Jan 04, 2009 at 03:57:26PM +0100, Eric Sesterhenn wrote:
> > > 
> > > Just tell me what i need to do, I am not really familiar with ftrace.
> > > I am only able to test 2.6.28-04980-gb58602a, since current -git is not
> > > able to boot on this box :|
> > 
> > Very cool!
> > 
> > The idea is to have __call_rcu() in kernel/rcutree.c record the
> > address of the callback (argument "head") and the function (argument
> > "func").  In rcu_do_batch(), just before invoking list->func(list),
> > also record the address of the callback ("list") and the function
> > (again, "func").
> > 
> > The new ftrace package has some mechanisms for doing this, but there is
> > always the old-fashioned way of using printk(), for example in
> > rcu_do_batch():
> > 
> > 	prefetch(next);
> > 	if (rcu_dump_callbacks)
> > 		printk("rcu_head=%p, func=%p\n", list, func);
> > 	list->func(list);
> > 
> > Initialize rcu_dump_callbacks to zero, then use a small kernel module
> > (or some such) to set it to one just before running your test.
> 
> i did it via the ugly printk and captured it via netconsole

Good stuff!!!

> for the box with rcutree you can find the log here:
> http://www.cccmz.de/~snakebyte/200.log
> The interesting part seems start at 155.858923, thats when I load
> the rcutorture module

When I download this, I see an 82.100963 immediately followed by an
167.894017, no 155.858923.  Do I have the right log?

> with http://www.cccmz.de/~snakebyte/200_1.log i had another try,
> actually saw a rcu_do_batch: rcu_head=d1907720, func=(null) in the traces,
> box went into unresponsive mode afterwards

Interesting.  The original function is 0xd1902ad0, passed to __call_rcu()
at 157.364214 -- would you be able to tell me what function that
corresponds to?  (Looks to me like a module, perhaps rcutorture?)

Whatever function it is, the rcu_head definitely got corrupted some
time during the 8 milliseconds or so that the callback was waiting for
a grace period.  However, it did just fine being invoked several times
beforehand -- this is a very popular RCU callback function, it appears.

My guess is that the ensuing CPU 0 stall messages are due to CPU 0
having a heart attack trying to execute at address 0.

Are you swapping over NFS, or is the netpoll_send_udp() just garbage on
the stack?

> And just in case it helps somehow: http://www.cccmz.de/~snakebyte/200_3.log

Hmmm...  The main recent change to kernel/rcutorture.c is the addition
of a reboot notifier, so that rcutorture can shut itself down gracefully
should the system go down while rcutorture is still running.  This panic
occurs in rcu_torture_stutter(), which is shown below:

	static int
	rcu_torture_stutter(void *arg)
	{
		VERBOSE_PRINTK_STRING("rcu_torture_stutter task started");
		do {
			schedule_timeout_interruptible(stutter * HZ);
			stutter_pause_test = 1;
			if (!kthread_should_stop() && !fullstop)
				schedule_timeout_interruptible(stutter * HZ);
			stutter_pause_test = 0;
		} while (!kthread_should_stop() && !fullstop);
		VERBOSE_PRINTK_STRING("rcu_torture_stutter task stopping");
		return 0;
	}

I don't see any reason why this function would transfer control to
location 0x60, though I do see a bug in the new shutdown code.  I
don't expect this to make any difference, but a patch is included
nevertheless.

> for the one with rcupreemt you can find the log here:
> http://www.cccmz.de/~snakebyte/201.log
> rcutorture is loaded at 87.312399

Hmmm...  Offset 0x60 seems to be a common thread.  This lets
rcu_torture_ops off the hook, since it isn't that large.

Anyway, could you give the attached patch a go, even though I cannot see
how it would help?  ;-)

Do you get the same error running Classic RCU?

							Thanx, Paul

> Greetings, Eric

Fix an rcutorture bug that prevents the shutdown notifier from ever
actually having any effect, due to the fact that kthreads ignore all
signals.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 rcutorture.c |   18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
index b310655..d54111c 100644
--- a/kernel/rcutorture.c
+++ b/kernel/rcutorture.c
@@ -136,7 +136,7 @@ static int stutter_pause_test = 0;
 #endif
 int rcutorture_runnable = RCUTORTURE_RUNNABLE_INIT;
 
-#define FULLSTOP_SIGNALED 1	/* Bail due to signal. */
+#define FULLSTOP_SHUTDOWN 1	/* Bail due to system shutdown/panic. */
 #define FULLSTOP_CLEANUP  2	/* Orderly shutdown. */
 static int fullstop;		/* stop generating callbacks at test end. */
 DEFINE_MUTEX(fullstop_mutex);	/* protect fullstop transitions and */
@@ -151,12 +151,10 @@ rcutorture_shutdown_notify(struct notifier_block *unused1,
 {
 	if (fullstop)
 		return NOTIFY_DONE;
-	if (signal_pending(current)) {
-		mutex_lock(&fullstop_mutex);
-		if (!ACCESS_ONCE(fullstop))
-			fullstop = FULLSTOP_SIGNALED;
-		mutex_unlock(&fullstop_mutex);
-	}
+	mutex_lock(&fullstop_mutex);
+	if (!fullstop)
+		fullstop = FULLSTOP_SHUTDOWN;
+	mutex_unlock(&fullstop_mutex);
 	return NOTIFY_DONE;
 }
 
@@ -624,7 +622,7 @@ rcu_torture_writer(void *arg)
 		rcu_stutter_wait();
 	} while (!kthread_should_stop() && !fullstop);
 	VERBOSE_PRINTK_STRING("rcu_torture_writer task stopping");
-	while (!kthread_should_stop() && fullstop != FULLSTOP_SIGNALED)
+	while (!kthread_should_stop() && fullstop != FULLSTOP_SHUTDOWN)
 		schedule_timeout_uninterruptible(1);
 	return 0;
 }
@@ -649,7 +647,7 @@ rcu_torture_fakewriter(void *arg)
 	} while (!kthread_should_stop() && !fullstop);
 
 	VERBOSE_PRINTK_STRING("rcu_torture_fakewriter task stopping");
-	while (!kthread_should_stop() && fullstop != FULLSTOP_SIGNALED)
+	while (!kthread_should_stop() && fullstop != FULLSTOP_SHUTDOWN)
 		schedule_timeout_uninterruptible(1);
 	return 0;
 }
@@ -759,7 +757,7 @@ rcu_torture_reader(void *arg)
 	VERBOSE_PRINTK_STRING("rcu_torture_reader task stopping");
 	if (irqreader && cur_ops->irqcapable)
 		del_timer_sync(&t);
-	while (!kthread_should_stop() && fullstop != FULLSTOP_SIGNALED)
+	while (!kthread_should_stop() && fullstop != FULLSTOP_SHUTDOWN)
 		schedule_timeout_uninterruptible(1);
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-05  2:28                     ` Paul E. McKenney
@ 2009-01-05 12:14                       ` Eric Sesterhenn
  2009-01-05 18:00                         ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-05 12:14 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Mon, Jan 05, 2009 at 12:38:55AM +0100, Eric Sesterhenn wrote:
> > hi,
> > 
> > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > On Sun, Jan 04, 2009 at 03:57:26PM +0100, Eric Sesterhenn wrote:
> > > > 
> > > > Just tell me what i need to do, I am not really familiar with ftrace.
> > > > I am only able to test 2.6.28-04980-gb58602a, since current -git is not
> > > > able to boot on this box :|
> > > 
> > > Very cool!
> > > 
> > > The idea is to have __call_rcu() in kernel/rcutree.c record the
> > > address of the callback (argument "head") and the function (argument
> > > "func").  In rcu_do_batch(), just before invoking list->func(list),
> > > also record the address of the callback ("list") and the function
> > > (again, "func").
> > > 
> > > The new ftrace package has some mechanisms for doing this, but there is
> > > always the old-fashioned way of using printk(), for example in
> > > rcu_do_batch():
> > > 
> > > 	prefetch(next);
> > > 	if (rcu_dump_callbacks)
> > > 		printk("rcu_head=%p, func=%p\n", list, func);
> > > 	list->func(list);
> > > 
> > > Initialize rcu_dump_callbacks to zero, then use a small kernel module
> > > (or some such) to set it to one just before running your test.
> > 
> > i did it via the ugly printk and captured it via netconsole
> 
> Good stuff!!!
> 
> > for the box with rcutree you can find the log here:
> > http://www.cccmz.de/~snakebyte/200.log
> > The interesting part seems start at 155.858923, thats when I load
> > the rcutorture module
> 
> When I download this, I see an 82.100963 immediately followed by an
> 167.894017, no 155.858923.  Do I have the right log?
> 
> > with http://www.cccmz.de/~snakebyte/200_1.log i had another try,
> > actually saw a rcu_do_batch: rcu_head=d1907720, func=(null) in the traces,
> > box went into unresponsive mode afterwards
> 
> Interesting.  The original function is 0xd1902ad0, passed to __call_rcu()
> at 157.364214 -- would you be able to tell me what function that
> corresponds to?  (Looks to me like a module, perhaps rcutorture?)

I did the log more than once, guess this was the time from the one
i uploaded earlier with the same name

> Whatever function it is, the rcu_head definitely got corrupted some
> time during the 8 milliseconds or so that the callback was waiting for
> a grace period.  However, it did just fine being invoked several times
> beforehand -- this is a very popular RCU callback function, it appears.
>
> My guess is that the ensuing CPU 0 stall messages are due to CPU 0
> having a heart attack trying to execute at address 0.

Could the popular rcu function be registered by rcutorture, but when
we remove the module the callback is no longer valid? I can compile
a kernel just fine and with other stress tests i did not see any oops so
far.

> Are you swapping over NFS, or is the netpoll_send_udp() just garbage on
> the stack?

Swap is locally on a disc, nfs is configured but not used for swap.

> > And just in case it helps somehow: http://www.cccmz.de/~snakebyte/200_3.log
> 
> Hmmm...  The main recent change to kernel/rcutorture.c is the addition
> of a reboot notifier, so that rcutorture can shut itself down gracefully
> should the system go down while rcutorture is still running.  This panic
> occurs in rcu_torture_stutter(), which is shown below:
> 
> 	static int
> 	rcu_torture_stutter(void *arg)
> 	{
> 		VERBOSE_PRINTK_STRING("rcu_torture_stutter task started");
> 		do {
> 			schedule_timeout_interruptible(stutter * HZ);
> 			stutter_pause_test = 1;
> 			if (!kthread_should_stop() && !fullstop)
> 				schedule_timeout_interruptible(stutter * HZ);
> 			stutter_pause_test = 0;
> 		} while (!kthread_should_stop() && !fullstop);
> 		VERBOSE_PRINTK_STRING("rcu_torture_stutter task stopping");
> 		return 0;
> 	}
> 
> I don't see any reason why this function would transfer control to
> location 0x60, though I do see a bug in the new shutdown code.  I
> don't expect this to make any difference, but a patch is included
> nevertheless.

Sadly it doesnt.

> > for the one with rcupreemt you can find the log here:
> > http://www.cccmz.de/~snakebyte/201.log
> > rcutorture is loaded at 87.312399
> 
> Hmmm...  Offset 0x60 seems to be a common thread.  This lets
> rcu_torture_ops off the hook, since it isn't that large.
> 
> Anyway, could you give the attached patch a go, even though I cannot see
> how it would help?  ;-)

This is with the patch on the preemt rcu box (the 201.logs)

[  593.404728] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
irqreader=1
[  601.000124] BUG: unable to handle kernel paging request at d0af8a2d
[  601.000431] IP: [<d0af8a2d>] 0xd0af8a2d
[  601.000696] *pde = 0fbb4067 *pte = 00000000 
[  601.000946] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[  601.001027] last sysfs file: /sys/block/ram9/range
[  601.001027] Modules linked in: [last unloaded: rcutorture]
[  601.001027] 
[  601.001027] Pid: 5021, comm: rcu_torture_fak Tainted: G        W
(2.6.28-05692-g7d3b56b-dirty #169) System Name
[  601.001027] EIP: 0060:[<d0af8a2d>] EFLAGS: 00010246 CPU: 0
[  601.001027] EIP is at 0xd0af8a2d
[  601.001027] EAX: 00000000 EBX: c98acfc4 ECX: c04f2c53 EDX: c98acf74
[  601.001027] ESI: d0af8c0a EDI: 00000000 EBP: c98acfbc ESP: c98acfbc
[  601.001027]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[  601.001027] Process rcu_torture_fak (pid: 5021, ti=c98ac000
task=cbb64100 task.ti=c98ac000)
[  601.001027] Stack:
[  601.001027]  c98acfd0 d0af8c89 74816929 000024a7 00000000 c98acfe0
c0136bcc c0136b8e
[  601.001027]  00000000 00000000 c0103643 c9897ea4 00000000 00000000
00000000 00001301
[  601.001027]  00008169
[  601.001027] Call Trace:
[  601.001027]  [<c0136bcc>] ? kthread+0x3e/0x66
[  601.001027]  [<c0136b8e>] ? kthread+0x0/0x66
[  601.001027]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
[  601.001027] Code:  Bad EIP value.
[  601.001027] EIP: [<d0af8a2d>] 0xd0af8a2d SS:ESP 0068:c98acfbc
[  601.001027] ---[ end trace 4eaa2a86a8e2da22 ]---
[  601.006764] BUG: unable to handle kernel paging request at d0af8a2d
[  601.007023] IP: [<d0af8a2d>] 0xd0af8a2d
[  601.007265] *pde = 0fbb4067 *pte = 00000000 
[  601.007511] Oops: 0000 [#2] PREEMPT DEBUG_PAGEALLOC
[  601.007716] last sysfs file: /sys/block/ram9/range
[  601.007716] Modules linked in: [last unloaded: rcutorture]
[  601.007716] 
[  601.007716] Pid: 5020, comm: rcu_torture_fak Tainted: G      D W
(2.6.28-05692-g7d3b56b-dirty #169) System Name
[  601.007716] EIP: 0060:[<d0af8a2d>] EFLAGS: 00010296 CPU: 0
[  601.007716] EIP is at 0xd0af8a2d
[  601.007716] EAX: 00000000 EBX: c98adfc4 ECX: c04f2c53 EDX: c98adf74
[  601.007716] ESI: d0af8c0a EDI: 00000000 EBP: c98adfbc ESP: c98adfbc
[  601.007716]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[  601.007716] Process rcu_torture_fak (pid: 5020, ti=c98ad000
task=cbb62700 task.ti=c98ad000)
[  601.007716] Stack:
[  601.007716]  c98adfd0 d0af8c89 e5bc239c 000024ab 00000000 c98adfe0
c0136bcc c0136b8e
[  601.007716]  00000000 00000000 c0103643 c9897ea4 00000000 00000000
00000000 00000201
[  601.007716]  00000674
[  601.007716] Call Trace:
[  601.007716]  [<c0136bcc>] ? kthread+0x3e/0x66
[  601.007716]  [<c0136b8e>] ? kthread+0x0/0x66
[  601.007716]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
[  601.007716] Code:  Bad EIP value.
[  601.007716] EIP: [<d0af8a2d>] 0xd0af8a2d SS:ESP 0068:c98adfbc
[  601.007716] ---[ end trace 4eaa2a86a8e2da22 ]---
[  601.013281] BUG: unable to handle kernel paging request at d0af8a2d
[  601.013542] IP: [<d0af8a2d>] 0xd0af8a2d
[  601.013785] *pde = 0fbb4067 *pte = 00000000 
[  601.014032] Oops: 0000 [#3] PREEMPT DEBUG_PAGEALLOC
[  601.014239] last sysfs file: /sys/block/ram9/range
[  601.014239] Modules linked in: [last unloaded: rcutorture]
[  601.014239] 
[  601.014239] Pid: 5019, comm: rcu_torture_fak Tainted: G      D W
(2.6.28-05692-g7d3b56b-dirty #169) System Name
[  601.014239] EIP: 0060:[<d0af8a2d>] EFLAGS: 00010286 CPU: 0
[  601.014239] EIP is at 0xd0af8a2d
[  601.014239] EAX: 00000000 EBX: c98b2fc4 ECX: c04f2c53 EDX: c98b2f74
[  601.014239] ESI: d0af8c0a EDI: 00000000 EBP: c98b2fbc ESP: c98b2fbc
[  601.014239]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[  601.014239] Process rcu_torture_fak (pid: 5019, ti=c98b2000
task=c9804e00 task.ti=c98b2000)
[  601.014239] Stack:
[  601.014239]  c98b2fd0 d0af8c89 61a1df8f 000024a5 00000000 c98b2fe0
c0136bcc c0136b8e
[  601.014239]  00000000 00000000 c0103643 c9897ea4 00000000 00000000
00000000 000007bf
[  601.014239]  00000012
[  601.014239] Call Trace:
[  601.014239]  [<c0136bcc>] ? kthread+0x3e/0x66
[  601.014239]  [<c0136b8e>] ? kthread+0x0/0x66
[  601.014239]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
[  601.014239] Code:  Bad EIP value.
[  601.014239] EIP: [<d0af8a2d>] 0xd0af8a2d SS:ESP 0068:c98b2fbc
[  601.014239] ---[ end trace 4eaa2a86a8e2da22 ]---

then i added the rcutree patch (Fix rcutree grace-period-latency bug on
small systems) and got

[   65.968568] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
irqreader=1
[   68.872082] BUG: unable to handle kernel NULL pointer dereference at
(null)
[   68.872455] IP: [<(null)>] (null)
[   68.872653] *pde = 00000000 
[   68.872849] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[   68.873027] last sysfs file: /sys/block/ram9/range
[   68.873027] Modules linked in: [last unloaded: rcutorture]
[   68.873027] 
[   68.873027] Pid: 5023, comm: bash Tainted: G        W
(2.6.28-05692-g7d3b56b-dirty #169) System Name
[   68.873027] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
[   68.873027] EIP is at 0x0
[   68.873027] EAX: d0afd0b8 EBX: 00000000 ECX: c01612a6 EDX: 00000006
[   68.873027] ESI: d0afd0b8 EDI: 0000001c EBP: c0b03fe0 ESP: c0b03fd4
[   68.873027]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[   68.873027] Process bash (pid: 5023, ti=c0b03000 task=c98d1a00
task.ti=ce824000)
[   68.873027] Stack:
[   68.873027]  c01612ad 00000200 00000001 c0b03ff8 c012aa97 0000000a
ce824e98 00000046
[   68.873027]  c012aa28 ce824ea8 c01042c2
[   68.873027] Call Trace:
[   68.873027]  [<c01612ad>] ? rcu_process_callbacks+0x65/0x79
[   68.873027]  [<c012aa97>] ? __do_softirq+0x6f/0xf6
[   68.873027]  [<c012aa28>] ? __do_softirq+0x0/0xf6
[   68.873027]  <IRQ> <0> [<c012a9a5>] ? irq_exit+0x40/0x7c
[   68.873027]  [<c0110ce1>] ? smp_apic_timer_interrupt+0x68/0x73
[   68.873027]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
[   68.873027]  [<c07a024c>] ? _spin_lock+0x56/0x5d
[   68.873027]  [<c07a0000>] ? _write_unlock_bh+0x28/0x37
[   68.873027]  [<c017ad50>] ? do_wp_page+0x38b/0x453
[   68.873027]  [<c017be9e>] ? handle_mm_fault+0x4e8/0x54d
[   68.873027]  [<c0139da3>] ? down_read_trylock+0x3d/0x47
[   68.873027]  [<c07a25d3>] ? do_page_fault+0x2dc/0x601
[   68.873027]  [<c07a22f7>] ? do_page_fault+0x0/0x601
[   68.873027]  [<c07a0bd7>] ? error_code+0x6f/0x74
[   68.873027] Code:  Bad EIP value.
[   68.873027] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b03fd4
[   68.880266] ---[ end trace 4eaa2a86a8e2da22 ]---
[   68.880394] Kernel panic - not syncing: Fatal exception in interrupt


Tried again without the rcutree patch and got this one,
guess its just a timing thing wheter the NULL pointer deref or the other
appears

[   99.643517] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
irqreader=1
[  104.149086] BUG: unable to handle kernel NULL pointer dereference at
(null)
[  104.149418] IP: [<(null)>] (null)
[  104.149615] *pde = 00000000 
[  104.149813] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[  104.150029] last sysfs file: /sys/block/ram9/range
[  104.150029] Modules linked in: [last unloaded: rcutorture]
[  104.150029] 
[  104.150029] Pid: 5030, comm: udevd Tainted: G        W
(2.6.28-05692-g7d3b56b-dirty #169) System Name
[  104.150029] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
[  104.150029] EIP is at 0x0
[  104.150029] EAX: d0afcc38 EBX: 00000000 ECX: c01612a6 EDX: 00000006
[  104.150029] ESI: d0afcc38 EDI: 0000001c EBP: c0b03fe0 ESP: c0b03fd4
[  104.150029]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[  104.150029] Process udevd (pid: 5030, ti=c0b03000 task=c98cb400
task.ti=c98b1000)
[  104.150029] Stack:
[  104.150029]  c01612ad 00000200 00000001 c0b03ff8 c012aa97 0000000a
c98b1e40 00000046
[  104.150029]  c012aa28 c98b1e50 c01042c2
[  104.150029] Call Trace:
[  104.150029]  [<c01612ad>] ? rcu_process_callbacks+0x65/0x79
[  104.150029]  [<c012aa97>] ? __do_softirq+0x6f/0xf6
[  104.150029]  [<c012aa28>] ? __do_softirq+0x0/0xf6
[  104.150029]  <IRQ> <0> [<c012a9a5>] ? irq_exit+0x40/0x7c
[  104.150029]  [<c0110ce1>] ? smp_apic_timer_interrupt+0x68/0x73
[  104.150029]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
[  104.150029]  [<c016e63f>] ? find_lock_page+0x5f/0x63
[  104.150029]  [<c0170122>] ? filemap_fault+0x9f/0x391
[  104.150029]  [<c017a6e8>] ? __do_fault+0x40/0x31d
[  104.150029]  [<c017bc30>] ? handle_mm_fault+0x27a/0x54d
[  104.150029]  [<c0139da3>] ? down_read_trylock+0x3d/0x47
[  104.150029]  [<c07a25d3>] ? do_page_fault+0x2dc/0x601
[  104.150029]  [<c07a22f7>] ? do_page_fault+0x0/0x601
[  104.150029]  [<c07a0bd7>] ? error_code+0x6f/0x74
[  104.150029] Code:  Bad EIP value.
[  104.150029] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b03fd4
[  104.157313] ---[ end trace 4eaa2a86a8e2da22 ]---
[  104.157441] Kernel panic - not syncing: Fatal exception in interrupt

I enabled classic rcu and pulled current -git (without both patches) and got these:

[   67.424425] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
irqreader=1
[   70.205083] BUG: unable to handle kernel NULL pointer dereference at
(null)
[   70.205416] IP: [<(null)>] (null)
[   70.205605] *pde = 00000000 
[   70.205798] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[   70.206026] last sysfs file: /sys/block/ram9/range
[   70.206026] Modules linked in: [last unloaded: rcutorture]
[   70.206026] 
[   70.206026] Pid: 1726, comm: udevd Tainted: G        W
(2.6.28-05716-gfe0bdec #170) System Name
[   70.206026] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
[   70.206026] EIP is at 0x0
[   70.206026] EAX: d0afc018 EBX: c10697c0 ECX: c0161905 EDX: d0afc018
[   70.206026] ESI: 00000000 EDI: 00000001 EBP: c0b0bfd8 ESP: c0b0bfc4
[   70.206026]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[   70.206026] Process udevd (pid: 1726, ti=c0b0b000 task=cee10000
task.ti=ce80b000)
[   70.206026] Stack:
[   70.206026]  c016197c 00000202 00000200 00000001 0000001c c0b0bfe0
c0161a00 c0b0bff8
[   70.206026]  c012acca 0000000a ce80bec4 00000046 c012ac5b ce80bed4
c01042c2
[   70.206026] Call Trace:
[   70.206026]  [<c016197c>] ? __rcu_process_callbacks+0x141/0x1ae
[   70.206026]  [<c0161a00>] ? rcu_process_callbacks+0x17/0x28
[   70.206026]  [<c012acca>] ? __do_softirq+0x6f/0x100
[   70.206026]  [<c012ac5b>] ? __do_softirq+0x0/0x100
[   70.206026]  <IRQ> <0> [<c012abdd>] ? irq_exit+0x40/0x77
[   70.206026]  [<c0110ce9>] ? smp_apic_timer_interrupt+0x68/0x73
[   70.206026]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
[   70.206026]  [<c014007b>] ? tick_do_periodic_broadcast+0xf/0x3a
[   70.206026]  [<c018f325>] ? kmem_cache_free+0xbd/0xc5
[   70.206026]  [<c019a45d>] ? putname+0x28/0x33
[   70.206026]  [<c019a45d>] ? putname+0x28/0x33
[   70.206026]  [<c019b7a0>] ? do_rmdir+0xba/0xc3
[   70.206026]  [<c04e8b50>] ? trace_hardirqs_on_thunk+0xc/0x10
[   70.206026]  [<c0102ecf>] ? sysenter_exit+0xf/0x16
[   70.206026]  [<c01448f1>] ? trace_hardirqs_on_caller+0x17/0x15a
[   70.206026]  [<c019b7f2>] ? sys_rmdir+0x15/0x17
[   70.206026]  [<c0102ea1>] ? sysenter_do_call+0x12/0x31
[   70.206026] Code:  Bad EIP value.
[   70.206026] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b0bfc4
[   70.213814] ---[ end trace 4eaa2a86a8e2da22 ]---
[   70.213934] Kernel panic - not syncing: Fatal exception in interrupt




On the other box with tree rcu and the rcutorture patch (200.logs):


[  168.408126] BUG: unable to handle kernel NULL pointer dereference at
00000029
[  168.408433] IP: [<d1902ebf>] 0xd1902ebf
[  168.408614] *pde = 00000000 
[  168.408778] Oops: 0000 [#1] DEBUG_PAGEALLOC
[  168.408978] last sysfs file: /sys/block/sda/size
[  168.409164] Modules linked in: nfsd exportfs nfs lockd nfs_acl
auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
[  168.410099] 
[  168.410197] Pid: 4154, comm: rcu_torture_rea Not tainted (2.6.28 #83) 
[  168.410332] EIP: 0060:[<d1902ebf>] EFLAGS: 00010202 CPU: 0
[  168.410516] EIP is at 0xd1902ebf
[  168.410628] EAX: 00000001 EBX: 00000001 ECX: 00000000 EDX: 00000000
[  168.410760] ESI: 00000000 EDI: 00000000 EBP: c7d52fd0 ESP: c7d52f84
[  168.410992]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[  168.411120] Process rcu_torture_rea (pid: 4154, ti=c7d52000
task=c7b73710 task.ti=c7d52000)
[  168.411302] Stack:
[  168.411398]  00000002 00000000 c0936a24 c7b48f8c 00000001 d1903040
00000000 c0936000
[  168.411981]  d1902ff7 5f756372 74726f74 5f657275 00616572 0000103a
2e11c1af 00000dd1
[  168.412021]  00000000 d1902e50 00000000 c7d52fe0 c013d81a c013d7e0
00000000 00000000
[  168.412021] Call Trace:
[  168.412021]  [<c013d81a>] ? kthread+0x3a/0x70
[  168.412021]  [<c013d7e0>] ? kthread+0x0/0x70
[  168.412021]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
[  168.412021] Code: 28 85 f6 74 1c 8d 45 bc c7 45 c8 40 30 90 d1 c7 45
cc 00 00 00 00 e8 91 03 83 ee 90 a1 00 5e 90 d1 8b 1d a4 59 90 d1 85 db
74 12 <8b> 48 28 85 c9 74 0b 8b 55 bc 85 d2 0f 84 19 01 00 00 8d b4 26 
[  168.412021] EIP: [<d1902ebf>] 0xd1902ebf SS:ESP 0068:c7d52f84
[  168.418864] ---[ end trace 4bbd74c2c853213b ]---
[  168.429775] __call_rcu: rcu_head=c3f47f24, func=c01b5a70
[  168.429958] __call_rcu: rcu_head=c7bbecac, func=c0143580
[  168.430129] __call_rcu: rcu_head=c7dbeea0, func=c01a6ae0
[  168.431826] __call_rcu: rcu_head=c7de9328, func=c01a6ae0
[  168.432089] rcu_do_batch: rcu_head=c3f47f24, func=c01b5a70
[  168.432869] __call_rcu: rcu_head=c994fe18, func=c01b9c20
[  168.436064] rcu_do_batch: rcu_head=c7bbecac, func=c0143580
[  168.436226] rcu_do_batch: rcu_head=c7dbeea0, func=c01a6ae0
[  168.444046] rcu_do_batch: rcu_head=c7de9328, func=c01a6ae0
[  168.444193] rcu_do_batch: rcu_head=c994fe18, func=c01b9c20
[  168.445866] BUG: unable to handle kernel NULL pointer dereference at
00000029
[  168.446126] IP: [<d1902ebf>] 0xd1902ebf
[  168.446315] *pde = 00000000 
[  168.446488] Oops: 0000 [#2] DEBUG_PAGEALLOC
[  168.446692] last sysfs file: /sys/block/sda/size
[  168.446883] Modules linked in: nfsd exportfs nfs lockd nfs_acl
auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
[  168.447818] 
[  168.447916] Pid: 4153, comm: rcu_torture_rea Tainted: G      D
(2.6.28 #83) 
[  168.448022] EIP: 0060:[<d1902ebf>] EFLAGS: 00010202 CPU: 0
[  168.448022] EIP is at 0xd1902ebf
[  168.448022] EAX: 00000001 EBX: 00000001 ECX: 00000000 EDX: 00000000
[  168.448022] ESI: 00000000 EDI: 00000000 EBP: c7b48fd0 ESP: c7b48f84
[  168.448022]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[  168.448022] Process rcu_torture_rea (pid: 4153, ti=c7b48000
task=c7b5c710 task.ti=c7b48000)
[  168.448022] Stack:
[  168.448022]  00000002 00000000 c7d52f8c c7de5f74 00000001 d1903040
00000000 c0936000
[  168.448022]  d1902ff7 5f756372 74726f74 5f657275 00616572 00001039
b87c4524 00000ca3
[  168.448022]  00000000 d1902e50 00000000 c7b48fe0 c013d81a c013d7e0
00000000 00000000
[  168.448022] Call Trace:
[  168.448022]  [<c013d81a>] ? kthread+0x3a/0x70
[  168.448022]  [<c013d7e0>] ? kthread+0x0/0x70
[  168.448022]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
[  168.448022] Code: 28 85 f6 74 1c 8d 45 bc c7 45 c8 40 30 90 d1 c7 45
cc 00 00 00 00 e8 91 03 83 ee 90 a1 00 5e 90 d1 8b 1d a4 5[  168.457352]
BUG: unable to handle kernel NULL pointer dereference at 0000001d
[  168.457603] IP: [<d1903163>] 0xd1903163
[  168.457776] *pde = 00000000 
[  168.457942] Oops: 0000 [#3] DEBUG_PAGEALLOC
[  168.458141] last sysfs file: /sys/block/sda/size
[  168.458323] Modules linked in: nfsd exportfs nfs lockd nfs_acl
auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
[  168.459254] 
[  168.459350] Pid: 4151, comm: rcu_torture_fak Tainted: G      D
(2.6.28 #83) 
[  168.459523] EIP: 0060:[<d1903163>] EFLAGS: 00010287 CPU: 0
[  168.459706] EIP is at 0xd1903163
[  168.459818] EAX: 00000001 EBX: c7dcafc0 ECX: 35502ab8 EDX: 00000021
[  168.459949] ESI: cccccccd EDI: 00000000 EBP: c7dcafd0 ESP: c7dcafb8
[  168.460080]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[  168.460262] Process rcu_torture_fak (pid: 4151, ti=c7dca000
task=c7dcb710 task.ti=c7dca000)
[  168.460442] Stack:
[  168.460485]  c7dbadd8 00000000 01f30e05 00002699 00000000 d19030f0
c7dcafe0 c013d81a
[  168.460485]  c013d7e0 00000000 00000000 c0103cf3 c7ca9e70 00000000
00000000 00000000
[  168.460485]  20df0493 0a860000
[  168.460485] Call Trace:
[  168.460485]  [<c013d81a>] ? kthread+0x3a/0x70
[  168.460485]  [<c013d7e0>] ? kthread+0x0/0x70
[  168.460485]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
[  168.460485] Code: c1 f7 e6 c1 ea 03 8d 14 92 01 d2 29 d1 8d 41 01 e8
23 83 ca ee 89 d8 e8 1c f7 ff ff 25 ff 03 00 00 e8 f2 c7 99 ee a1 00 5e
90 d1 <ff> 50 1c e8 75 fc ff ff e8 d0 a4 83 ee 85 c0 8d b6 00 00 00 00 
[  168.460485] EIP: [<d1903163>] 0xd1903163 SS:ESP 0068:c7dcafb8
[  168.467531] ---[ end trace 4bbd74c2c853213b ]---
[  168.476855] BUG: unable to handle kernel NULL pointer dereference at
00000004
[  168.477134] IP: [<d1903233>] 0xd1903233
[  168.477322] *pde = 00000000 
[  168.477485] Oops: 0002 [#4] DEBUG_PAGEALLOC
[  168.477680] last sysfs file: /sys/block/sda/size
[  168.477857] Modules linked in: nfsd exportfs nfs lockd nfs_acl
auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
[  168.478774] 
[  168.478871] Pid: 4147, comm: rcu_torture_wri Tainted: G      D
(2.6.28 #83) 
[  168.479045] EIP: 0060:[<d1903233>] EFLAGS: 00010202 CPU: 0
[  168.479228] EIP is at 0xd1903233
[  168.479340] EAX: 00000000 EBX: d19073fc ECX: 00000001 EDX: 00000000
[  168.479472] ESI: 0000000a EDI: 00000000 EBP: c7e05fd0 ESP: c7e05fc0
[  168.479604]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[  168.479787] Process rcu_torture_wri (pid: 4147, ti=c7e05000
task=c994e710 task.ti=c7e05000)
[  168.479968] Stack:
[  168.480015]  00000000 00000000 00000000 d19031f0 c7e05fe0 c013d81a
c013d7e0 00000000
[  168.480015]  00000000 c0103cf3 c7ca9e70 00000000 00000000 00000000
7574726f 725f6572
[  168.480015] Call Trace:
[  168.480015]  [<c013d81a>] ? kthread+0x3a/0x70
[  168.480015]  [<c013d7e0>] ? kthread+0x0/0x70
[  168.480015]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
[  168.480015] Code: 01 00 00 ba 13 00 00 00 be 0a 00 00 00 a1 00 20 83
c0 e8 a1 21 82 ee e9 a0 00 00 00 8d 74 26 00 ff 05 6c 78 90 d1 8b 43 04
8b 13 <89> 42 04 89 10 b8 cc 59 90 d1 89 1b 89 5b 04 e8 19 a3 ca ee 83 
[  168.480015] EIP: [<d1903233>] 0xd1903233 SS:ESP 0068:c7e05fc0
[  168.486680] ---[ end trace 4bbd74c2c853213b ]---
[  168.486798] Kernel panic - not syncing: Fatal exception in interrupt
9 90 d1 85 db
74 12 <8b> 48 28 85 c9 74 0b 8b 55 bc 85 d2 0f 84 19 01 00 00 8d b4 26 
[  168.448022] EIP: [<d1902ebf>] 0xd1902ebf SS:ESP 0068:c7b48f84
[  168.456684] ---[ end trace 4bbd74c2c853213b ]---


Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-05 12:14                       ` Eric Sesterhenn
@ 2009-01-05 18:00                         ` Paul E. McKenney
  2009-01-05 18:56                           ` Eric Sesterhenn
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-05 18:00 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Mon, Jan 05, 2009 at 01:14:09PM +0100, Eric Sesterhenn wrote:
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Mon, Jan 05, 2009 at 12:38:55AM +0100, Eric Sesterhenn wrote:
> > > hi,
> > > 
> > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > On Sun, Jan 04, 2009 at 03:57:26PM +0100, Eric Sesterhenn wrote:
> > > > > 
> > > > > Just tell me what i need to do, I am not really familiar with ftrace.
> > > > > I am only able to test 2.6.28-04980-gb58602a, since current -git is not
> > > > > able to boot on this box :|
> > > > 
> > > > Very cool!
> > > > 
> > > > The idea is to have __call_rcu() in kernel/rcutree.c record the
> > > > address of the callback (argument "head") and the function (argument
> > > > "func").  In rcu_do_batch(), just before invoking list->func(list),
> > > > also record the address of the callback ("list") and the function
> > > > (again, "func").
> > > > 
> > > > The new ftrace package has some mechanisms for doing this, but there is
> > > > always the old-fashioned way of using printk(), for example in
> > > > rcu_do_batch():
> > > > 
> > > > 	prefetch(next);
> > > > 	if (rcu_dump_callbacks)
> > > > 		printk("rcu_head=%p, func=%p\n", list, func);
> > > > 	list->func(list);
> > > > 
> > > > Initialize rcu_dump_callbacks to zero, then use a small kernel module
> > > > (or some such) to set it to one just before running your test.
> > > 
> > > i did it via the ugly printk and captured it via netconsole
> > 
> > Good stuff!!!
> > 
> > > for the box with rcutree you can find the log here:
> > > http://www.cccmz.de/~snakebyte/200.log
> > > The interesting part seems start at 155.858923, thats when I load
> > > the rcutorture module
> > 
> > When I download this, I see an 82.100963 immediately followed by an
> > 167.894017, no 155.858923.  Do I have the right log?
> > 
> > > with http://www.cccmz.de/~snakebyte/200_1.log i had another try,
> > > actually saw a rcu_do_batch: rcu_head=d1907720, func=(null) in the traces,
> > > box went into unresponsive mode afterwards
> > 
> > Interesting.  The original function is 0xd1902ad0, passed to __call_rcu()
> > at 157.364214 -- would you be able to tell me what function that
> > corresponds to?  (Looks to me like a module, perhaps rcutorture?)
> 
> I did the log more than once, guess this was the time from the one
> i uploaded earlier with the same name
> 
> > Whatever function it is, the rcu_head definitely got corrupted some
> > time during the 8 milliseconds or so that the callback was waiting for
> > a grace period.  However, it did just fine being invoked several times
> > beforehand -- this is a very popular RCU callback function, it appears.
> >
> > My guess is that the ensuing CPU 0 stall messages are due to CPU 0
> > having a heart attack trying to execute at address 0.
> 
> Could the popular rcu function be registered by rcutorture, but when
> we remove the module the callback is no longer valid? I can compile
> a kernel just fine and with other stress tests i did not see any oops so
> far.

One approach would be to print out the address of rcutorture's RCU
callbacks at rcutorture module initialization time (in rcu_torture_init()
in kernel/rcutorture.c).  The two callbacks are rcu_torture_cb() and
rcu_bh_torture_wakeme_after_cb().  Unless you are specifying the 
"torture_type" parameter to rcutorture, only the first one should be in
use.

> > Are you swapping over NFS, or is the netpoll_send_udp() just garbage on
> > the stack?
> 
> Swap is locally on a disc, nfs is configured but not used for swap.

OK, just garbage on the stack, then.

> > > And just in case it helps somehow: http://www.cccmz.de/~snakebyte/200_3.log
> > 
> > Hmmm...  The main recent change to kernel/rcutorture.c is the addition
> > of a reboot notifier, so that rcutorture can shut itself down gracefully
> > should the system go down while rcutorture is still running.  This panic
> > occurs in rcu_torture_stutter(), which is shown below:
> > 
> > 	static int
> > 	rcu_torture_stutter(void *arg)
> > 	{
> > 		VERBOSE_PRINTK_STRING("rcu_torture_stutter task started");
> > 		do {
> > 			schedule_timeout_interruptible(stutter * HZ);
> > 			stutter_pause_test = 1;
> > 			if (!kthread_should_stop() && !fullstop)
> > 				schedule_timeout_interruptible(stutter * HZ);
> > 			stutter_pause_test = 0;
> > 		} while (!kthread_should_stop() && !fullstop);
> > 		VERBOSE_PRINTK_STRING("rcu_torture_stutter task stopping");
> > 		return 0;
> > 	}
> > 
> > I don't see any reason why this function would transfer control to
> > location 0x60, though I do see a bug in the new shutdown code.  I
> > don't expect this to make any difference, but a patch is included
> > nevertheless.
> 
> Sadly it doesnt.

:-/

> > > for the one with rcupreemt you can find the log here:
> > > http://www.cccmz.de/~snakebyte/201.log
> > > rcutorture is loaded at 87.312399
> > 
> > Hmmm...  Offset 0x60 seems to be a common thread.  This lets
> > rcu_torture_ops off the hook, since it isn't that large.
> > 
> > Anyway, could you give the attached patch a go, even though I cannot see
> > how it would help?  ;-)
> 
> This is with the patch on the preemt rcu box (the 201.logs)

Looking these over, I would be surprised if this is a problem in the RCU
infrastructure itself, given that it shows up with all three versions,
but it could easily be a problem in rcutorture.  I should be able to get
some tests running locally later today.

							Thanx, Paul

> [  593.404728] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
> stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
> irqreader=1
> [  601.000124] BUG: unable to handle kernel paging request at d0af8a2d
> [  601.000431] IP: [<d0af8a2d>] 0xd0af8a2d
> [  601.000696] *pde = 0fbb4067 *pte = 00000000 
> [  601.000946] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
> [  601.001027] last sysfs file: /sys/block/ram9/range
> [  601.001027] Modules linked in: [last unloaded: rcutorture]
> [  601.001027] 
> [  601.001027] Pid: 5021, comm: rcu_torture_fak Tainted: G        W
> (2.6.28-05692-g7d3b56b-dirty #169) System Name
> [  601.001027] EIP: 0060:[<d0af8a2d>] EFLAGS: 00010246 CPU: 0
> [  601.001027] EIP is at 0xd0af8a2d
> [  601.001027] EAX: 00000000 EBX: c98acfc4 ECX: c04f2c53 EDX: c98acf74
> [  601.001027] ESI: d0af8c0a EDI: 00000000 EBP: c98acfbc ESP: c98acfbc
> [  601.001027]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  601.001027] Process rcu_torture_fak (pid: 5021, ti=c98ac000
> task=cbb64100 task.ti=c98ac000)
> [  601.001027] Stack:
> [  601.001027]  c98acfd0 d0af8c89 74816929 000024a7 00000000 c98acfe0
> c0136bcc c0136b8e
> [  601.001027]  00000000 00000000 c0103643 c9897ea4 00000000 00000000
> 00000000 00001301
> [  601.001027]  00008169
> [  601.001027] Call Trace:
> [  601.001027]  [<c0136bcc>] ? kthread+0x3e/0x66
> [  601.001027]  [<c0136b8e>] ? kthread+0x0/0x66
> [  601.001027]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
> [  601.001027] Code:  Bad EIP value.
> [  601.001027] EIP: [<d0af8a2d>] 0xd0af8a2d SS:ESP 0068:c98acfbc
> [  601.001027] ---[ end trace 4eaa2a86a8e2da22 ]---
> [  601.006764] BUG: unable to handle kernel paging request at d0af8a2d
> [  601.007023] IP: [<d0af8a2d>] 0xd0af8a2d
> [  601.007265] *pde = 0fbb4067 *pte = 00000000 
> [  601.007511] Oops: 0000 [#2] PREEMPT DEBUG_PAGEALLOC
> [  601.007716] last sysfs file: /sys/block/ram9/range
> [  601.007716] Modules linked in: [last unloaded: rcutorture]
> [  601.007716] 
> [  601.007716] Pid: 5020, comm: rcu_torture_fak Tainted: G      D W
> (2.6.28-05692-g7d3b56b-dirty #169) System Name
> [  601.007716] EIP: 0060:[<d0af8a2d>] EFLAGS: 00010296 CPU: 0
> [  601.007716] EIP is at 0xd0af8a2d
> [  601.007716] EAX: 00000000 EBX: c98adfc4 ECX: c04f2c53 EDX: c98adf74
> [  601.007716] ESI: d0af8c0a EDI: 00000000 EBP: c98adfbc ESP: c98adfbc
> [  601.007716]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  601.007716] Process rcu_torture_fak (pid: 5020, ti=c98ad000
> task=cbb62700 task.ti=c98ad000)
> [  601.007716] Stack:
> [  601.007716]  c98adfd0 d0af8c89 e5bc239c 000024ab 00000000 c98adfe0
> c0136bcc c0136b8e
> [  601.007716]  00000000 00000000 c0103643 c9897ea4 00000000 00000000
> 00000000 00000201
> [  601.007716]  00000674
> [  601.007716] Call Trace:
> [  601.007716]  [<c0136bcc>] ? kthread+0x3e/0x66
> [  601.007716]  [<c0136b8e>] ? kthread+0x0/0x66
> [  601.007716]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
> [  601.007716] Code:  Bad EIP value.
> [  601.007716] EIP: [<d0af8a2d>] 0xd0af8a2d SS:ESP 0068:c98adfbc
> [  601.007716] ---[ end trace 4eaa2a86a8e2da22 ]---
> [  601.013281] BUG: unable to handle kernel paging request at d0af8a2d
> [  601.013542] IP: [<d0af8a2d>] 0xd0af8a2d
> [  601.013785] *pde = 0fbb4067 *pte = 00000000 
> [  601.014032] Oops: 0000 [#3] PREEMPT DEBUG_PAGEALLOC
> [  601.014239] last sysfs file: /sys/block/ram9/range
> [  601.014239] Modules linked in: [last unloaded: rcutorture]
> [  601.014239] 
> [  601.014239] Pid: 5019, comm: rcu_torture_fak Tainted: G      D W
> (2.6.28-05692-g7d3b56b-dirty #169) System Name
> [  601.014239] EIP: 0060:[<d0af8a2d>] EFLAGS: 00010286 CPU: 0
> [  601.014239] EIP is at 0xd0af8a2d
> [  601.014239] EAX: 00000000 EBX: c98b2fc4 ECX: c04f2c53 EDX: c98b2f74
> [  601.014239] ESI: d0af8c0a EDI: 00000000 EBP: c98b2fbc ESP: c98b2fbc
> [  601.014239]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  601.014239] Process rcu_torture_fak (pid: 5019, ti=c98b2000
> task=c9804e00 task.ti=c98b2000)
> [  601.014239] Stack:
> [  601.014239]  c98b2fd0 d0af8c89 61a1df8f 000024a5 00000000 c98b2fe0
> c0136bcc c0136b8e
> [  601.014239]  00000000 00000000 c0103643 c9897ea4 00000000 00000000
> 00000000 000007bf
> [  601.014239]  00000012
> [  601.014239] Call Trace:
> [  601.014239]  [<c0136bcc>] ? kthread+0x3e/0x66
> [  601.014239]  [<c0136b8e>] ? kthread+0x0/0x66
> [  601.014239]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
> [  601.014239] Code:  Bad EIP value.
> [  601.014239] EIP: [<d0af8a2d>] 0xd0af8a2d SS:ESP 0068:c98b2fbc
> [  601.014239] ---[ end trace 4eaa2a86a8e2da22 ]---
> 
> then i added the rcutree patch (Fix rcutree grace-period-latency bug on
> small systems) and got
> 
> [   65.968568] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
> stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
> irqreader=1
> [   68.872082] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [   68.872455] IP: [<(null)>] (null)
> [   68.872653] *pde = 00000000 
> [   68.872849] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
> [   68.873027] last sysfs file: /sys/block/ram9/range
> [   68.873027] Modules linked in: [last unloaded: rcutorture]
> [   68.873027] 
> [   68.873027] Pid: 5023, comm: bash Tainted: G        W
> (2.6.28-05692-g7d3b56b-dirty #169) System Name
> [   68.873027] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
> [   68.873027] EIP is at 0x0
> [   68.873027] EAX: d0afd0b8 EBX: 00000000 ECX: c01612a6 EDX: 00000006
> [   68.873027] ESI: d0afd0b8 EDI: 0000001c EBP: c0b03fe0 ESP: c0b03fd4
> [   68.873027]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> [   68.873027] Process bash (pid: 5023, ti=c0b03000 task=c98d1a00
> task.ti=ce824000)
> [   68.873027] Stack:
> [   68.873027]  c01612ad 00000200 00000001 c0b03ff8 c012aa97 0000000a
> ce824e98 00000046
> [   68.873027]  c012aa28 ce824ea8 c01042c2
> [   68.873027] Call Trace:
> [   68.873027]  [<c01612ad>] ? rcu_process_callbacks+0x65/0x79
> [   68.873027]  [<c012aa97>] ? __do_softirq+0x6f/0xf6
> [   68.873027]  [<c012aa28>] ? __do_softirq+0x0/0xf6
> [   68.873027]  <IRQ> <0> [<c012a9a5>] ? irq_exit+0x40/0x7c
> [   68.873027]  [<c0110ce1>] ? smp_apic_timer_interrupt+0x68/0x73
> [   68.873027]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
> [   68.873027]  [<c07a024c>] ? _spin_lock+0x56/0x5d
> [   68.873027]  [<c07a0000>] ? _write_unlock_bh+0x28/0x37
> [   68.873027]  [<c017ad50>] ? do_wp_page+0x38b/0x453
> [   68.873027]  [<c017be9e>] ? handle_mm_fault+0x4e8/0x54d
> [   68.873027]  [<c0139da3>] ? down_read_trylock+0x3d/0x47
> [   68.873027]  [<c07a25d3>] ? do_page_fault+0x2dc/0x601
> [   68.873027]  [<c07a22f7>] ? do_page_fault+0x0/0x601
> [   68.873027]  [<c07a0bd7>] ? error_code+0x6f/0x74
> [   68.873027] Code:  Bad EIP value.
> [   68.873027] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b03fd4
> [   68.880266] ---[ end trace 4eaa2a86a8e2da22 ]---
> [   68.880394] Kernel panic - not syncing: Fatal exception in interrupt
> 
> 
> Tried again without the rcutree patch and got this one,
> guess its just a timing thing wheter the NULL pointer deref or the other
> appears
> 
> [   99.643517] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
> stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
> irqreader=1
> [  104.149086] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [  104.149418] IP: [<(null)>] (null)
> [  104.149615] *pde = 00000000 
> [  104.149813] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
> [  104.150029] last sysfs file: /sys/block/ram9/range
> [  104.150029] Modules linked in: [last unloaded: rcutorture]
> [  104.150029] 
> [  104.150029] Pid: 5030, comm: udevd Tainted: G        W
> (2.6.28-05692-g7d3b56b-dirty #169) System Name
> [  104.150029] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
> [  104.150029] EIP is at 0x0
> [  104.150029] EAX: d0afcc38 EBX: 00000000 ECX: c01612a6 EDX: 00000006
> [  104.150029] ESI: d0afcc38 EDI: 0000001c EBP: c0b03fe0 ESP: c0b03fd4
> [  104.150029]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> [  104.150029] Process udevd (pid: 5030, ti=c0b03000 task=c98cb400
> task.ti=c98b1000)
> [  104.150029] Stack:
> [  104.150029]  c01612ad 00000200 00000001 c0b03ff8 c012aa97 0000000a
> c98b1e40 00000046
> [  104.150029]  c012aa28 c98b1e50 c01042c2
> [  104.150029] Call Trace:
> [  104.150029]  [<c01612ad>] ? rcu_process_callbacks+0x65/0x79
> [  104.150029]  [<c012aa97>] ? __do_softirq+0x6f/0xf6
> [  104.150029]  [<c012aa28>] ? __do_softirq+0x0/0xf6
> [  104.150029]  <IRQ> <0> [<c012a9a5>] ? irq_exit+0x40/0x7c
> [  104.150029]  [<c0110ce1>] ? smp_apic_timer_interrupt+0x68/0x73
> [  104.150029]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
> [  104.150029]  [<c016e63f>] ? find_lock_page+0x5f/0x63
> [  104.150029]  [<c0170122>] ? filemap_fault+0x9f/0x391
> [  104.150029]  [<c017a6e8>] ? __do_fault+0x40/0x31d
> [  104.150029]  [<c017bc30>] ? handle_mm_fault+0x27a/0x54d
> [  104.150029]  [<c0139da3>] ? down_read_trylock+0x3d/0x47
> [  104.150029]  [<c07a25d3>] ? do_page_fault+0x2dc/0x601
> [  104.150029]  [<c07a22f7>] ? do_page_fault+0x0/0x601
> [  104.150029]  [<c07a0bd7>] ? error_code+0x6f/0x74
> [  104.150029] Code:  Bad EIP value.
> [  104.150029] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b03fd4
> [  104.157313] ---[ end trace 4eaa2a86a8e2da22 ]---
> [  104.157441] Kernel panic - not syncing: Fatal exception in interrupt
> 
> I enabled classic rcu and pulled current -git (without both patches) and got these:
> 
> [   67.424425] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
> stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
> irqreader=1
> [   70.205083] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [   70.205416] IP: [<(null)>] (null)
> [   70.205605] *pde = 00000000 
> [   70.205798] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
> [   70.206026] last sysfs file: /sys/block/ram9/range
> [   70.206026] Modules linked in: [last unloaded: rcutorture]
> [   70.206026] 
> [   70.206026] Pid: 1726, comm: udevd Tainted: G        W
> (2.6.28-05716-gfe0bdec #170) System Name
> [   70.206026] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> [   70.206026] EIP is at 0x0
> [   70.206026] EAX: d0afc018 EBX: c10697c0 ECX: c0161905 EDX: d0afc018
> [   70.206026] ESI: 00000000 EDI: 00000001 EBP: c0b0bfd8 ESP: c0b0bfc4
> [   70.206026]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> [   70.206026] Process udevd (pid: 1726, ti=c0b0b000 task=cee10000
> task.ti=ce80b000)
> [   70.206026] Stack:
> [   70.206026]  c016197c 00000202 00000200 00000001 0000001c c0b0bfe0
> c0161a00 c0b0bff8
> [   70.206026]  c012acca 0000000a ce80bec4 00000046 c012ac5b ce80bed4
> c01042c2
> [   70.206026] Call Trace:
> [   70.206026]  [<c016197c>] ? __rcu_process_callbacks+0x141/0x1ae
> [   70.206026]  [<c0161a00>] ? rcu_process_callbacks+0x17/0x28
> [   70.206026]  [<c012acca>] ? __do_softirq+0x6f/0x100
> [   70.206026]  [<c012ac5b>] ? __do_softirq+0x0/0x100
> [   70.206026]  <IRQ> <0> [<c012abdd>] ? irq_exit+0x40/0x77
> [   70.206026]  [<c0110ce9>] ? smp_apic_timer_interrupt+0x68/0x73
> [   70.206026]  [<c0103521>] ? apic_timer_interrupt+0x2d/0x34
> [   70.206026]  [<c014007b>] ? tick_do_periodic_broadcast+0xf/0x3a
> [   70.206026]  [<c018f325>] ? kmem_cache_free+0xbd/0xc5
> [   70.206026]  [<c019a45d>] ? putname+0x28/0x33
> [   70.206026]  [<c019a45d>] ? putname+0x28/0x33
> [   70.206026]  [<c019b7a0>] ? do_rmdir+0xba/0xc3
> [   70.206026]  [<c04e8b50>] ? trace_hardirqs_on_thunk+0xc/0x10
> [   70.206026]  [<c0102ecf>] ? sysenter_exit+0xf/0x16
> [   70.206026]  [<c01448f1>] ? trace_hardirqs_on_caller+0x17/0x15a
> [   70.206026]  [<c019b7f2>] ? sys_rmdir+0x15/0x17
> [   70.206026]  [<c0102ea1>] ? sysenter_do_call+0x12/0x31
> [   70.206026] Code:  Bad EIP value.
> [   70.206026] EIP: [<00000000>] 0x0 SS:ESP 0068:c0b0bfc4
> [   70.213814] ---[ end trace 4eaa2a86a8e2da22 ]---
> [   70.213934] Kernel panic - not syncing: Fatal exception in interrupt
> 
> 
> 
> 
> On the other box with tree rcu and the rcutorture patch (200.logs):
> 
> 
> [  168.408126] BUG: unable to handle kernel NULL pointer dereference at
> 00000029
> [  168.408433] IP: [<d1902ebf>] 0xd1902ebf
> [  168.408614] *pde = 00000000 
> [  168.408778] Oops: 0000 [#1] DEBUG_PAGEALLOC
> [  168.408978] last sysfs file: /sys/block/sda/size
> [  168.409164] Modules linked in: nfsd exportfs nfs lockd nfs_acl
> auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
> [  168.410099] 
> [  168.410197] Pid: 4154, comm: rcu_torture_rea Not tainted (2.6.28 #83) 
> [  168.410332] EIP: 0060:[<d1902ebf>] EFLAGS: 00010202 CPU: 0
> [  168.410516] EIP is at 0xd1902ebf
> [  168.410628] EAX: 00000001 EBX: 00000001 ECX: 00000000 EDX: 00000000
> [  168.410760] ESI: 00000000 EDI: 00000000 EBP: c7d52fd0 ESP: c7d52f84
> [  168.410992]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  168.411120] Process rcu_torture_rea (pid: 4154, ti=c7d52000
> task=c7b73710 task.ti=c7d52000)
> [  168.411302] Stack:
> [  168.411398]  00000002 00000000 c0936a24 c7b48f8c 00000001 d1903040
> 00000000 c0936000
> [  168.411981]  d1902ff7 5f756372 74726f74 5f657275 00616572 0000103a
> 2e11c1af 00000dd1
> [  168.412021]  00000000 d1902e50 00000000 c7d52fe0 c013d81a c013d7e0
> 00000000 00000000
> [  168.412021] Call Trace:
> [  168.412021]  [<c013d81a>] ? kthread+0x3a/0x70
> [  168.412021]  [<c013d7e0>] ? kthread+0x0/0x70
> [  168.412021]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
> [  168.412021] Code: 28 85 f6 74 1c 8d 45 bc c7 45 c8 40 30 90 d1 c7 45
> cc 00 00 00 00 e8 91 03 83 ee 90 a1 00 5e 90 d1 8b 1d a4 59 90 d1 85 db
> 74 12 <8b> 48 28 85 c9 74 0b 8b 55 bc 85 d2 0f 84 19 01 00 00 8d b4 26 
> [  168.412021] EIP: [<d1902ebf>] 0xd1902ebf SS:ESP 0068:c7d52f84
> [  168.418864] ---[ end trace 4bbd74c2c853213b ]---
> [  168.429775] __call_rcu: rcu_head=c3f47f24, func=c01b5a70
> [  168.429958] __call_rcu: rcu_head=c7bbecac, func=c0143580
> [  168.430129] __call_rcu: rcu_head=c7dbeea0, func=c01a6ae0
> [  168.431826] __call_rcu: rcu_head=c7de9328, func=c01a6ae0
> [  168.432089] rcu_do_batch: rcu_head=c3f47f24, func=c01b5a70
> [  168.432869] __call_rcu: rcu_head=c994fe18, func=c01b9c20
> [  168.436064] rcu_do_batch: rcu_head=c7bbecac, func=c0143580
> [  168.436226] rcu_do_batch: rcu_head=c7dbeea0, func=c01a6ae0
> [  168.444046] rcu_do_batch: rcu_head=c7de9328, func=c01a6ae0
> [  168.444193] rcu_do_batch: rcu_head=c994fe18, func=c01b9c20
> [  168.445866] BUG: unable to handle kernel NULL pointer dereference at
> 00000029
> [  168.446126] IP: [<d1902ebf>] 0xd1902ebf
> [  168.446315] *pde = 00000000 
> [  168.446488] Oops: 0000 [#2] DEBUG_PAGEALLOC
> [  168.446692] last sysfs file: /sys/block/sda/size
> [  168.446883] Modules linked in: nfsd exportfs nfs lockd nfs_acl
> auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
> [  168.447818] 
> [  168.447916] Pid: 4153, comm: rcu_torture_rea Tainted: G      D
> (2.6.28 #83) 
> [  168.448022] EIP: 0060:[<d1902ebf>] EFLAGS: 00010202 CPU: 0
> [  168.448022] EIP is at 0xd1902ebf
> [  168.448022] EAX: 00000001 EBX: 00000001 ECX: 00000000 EDX: 00000000
> [  168.448022] ESI: 00000000 EDI: 00000000 EBP: c7b48fd0 ESP: c7b48f84
> [  168.448022]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  168.448022] Process rcu_torture_rea (pid: 4153, ti=c7b48000
> task=c7b5c710 task.ti=c7b48000)
> [  168.448022] Stack:
> [  168.448022]  00000002 00000000 c7d52f8c c7de5f74 00000001 d1903040
> 00000000 c0936000
> [  168.448022]  d1902ff7 5f756372 74726f74 5f657275 00616572 00001039
> b87c4524 00000ca3
> [  168.448022]  00000000 d1902e50 00000000 c7b48fe0 c013d81a c013d7e0
> 00000000 00000000
> [  168.448022] Call Trace:
> [  168.448022]  [<c013d81a>] ? kthread+0x3a/0x70
> [  168.448022]  [<c013d7e0>] ? kthread+0x0/0x70
> [  168.448022]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
> [  168.448022] Code: 28 85 f6 74 1c 8d 45 bc c7 45 c8 40 30 90 d1 c7 45
> cc 00 00 00 00 e8 91 03 83 ee 90 a1 00 5e 90 d1 8b 1d a4 5[  168.457352]
> BUG: unable to handle kernel NULL pointer dereference at 0000001d
> [  168.457603] IP: [<d1903163>] 0xd1903163
> [  168.457776] *pde = 00000000 
> [  168.457942] Oops: 0000 [#3] DEBUG_PAGEALLOC
> [  168.458141] last sysfs file: /sys/block/sda/size
> [  168.458323] Modules linked in: nfsd exportfs nfs lockd nfs_acl
> auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
> [  168.459254] 
> [  168.459350] Pid: 4151, comm: rcu_torture_fak Tainted: G      D
> (2.6.28 #83) 
> [  168.459523] EIP: 0060:[<d1903163>] EFLAGS: 00010287 CPU: 0
> [  168.459706] EIP is at 0xd1903163
> [  168.459818] EAX: 00000001 EBX: c7dcafc0 ECX: 35502ab8 EDX: 00000021
> [  168.459949] ESI: cccccccd EDI: 00000000 EBP: c7dcafd0 ESP: c7dcafb8
> [  168.460080]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  168.460262] Process rcu_torture_fak (pid: 4151, ti=c7dca000
> task=c7dcb710 task.ti=c7dca000)
> [  168.460442] Stack:
> [  168.460485]  c7dbadd8 00000000 01f30e05 00002699 00000000 d19030f0
> c7dcafe0 c013d81a
> [  168.460485]  c013d7e0 00000000 00000000 c0103cf3 c7ca9e70 00000000
> 00000000 00000000
> [  168.460485]  20df0493 0a860000
> [  168.460485] Call Trace:
> [  168.460485]  [<c013d81a>] ? kthread+0x3a/0x70
> [  168.460485]  [<c013d7e0>] ? kthread+0x0/0x70
> [  168.460485]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
> [  168.460485] Code: c1 f7 e6 c1 ea 03 8d 14 92 01 d2 29 d1 8d 41 01 e8
> 23 83 ca ee 89 d8 e8 1c f7 ff ff 25 ff 03 00 00 e8 f2 c7 99 ee a1 00 5e
> 90 d1 <ff> 50 1c e8 75 fc ff ff e8 d0 a4 83 ee 85 c0 8d b6 00 00 00 00 
> [  168.460485] EIP: [<d1903163>] 0xd1903163 SS:ESP 0068:c7dcafb8
> [  168.467531] ---[ end trace 4bbd74c2c853213b ]---
> [  168.476855] BUG: unable to handle kernel NULL pointer dereference at
> 00000004
> [  168.477134] IP: [<d1903233>] 0xd1903233
> [  168.477322] *pde = 00000000 
> [  168.477485] Oops: 0002 [#4] DEBUG_PAGEALLOC
> [  168.477680] last sysfs file: /sys/block/sda/size
> [  168.477857] Modules linked in: nfsd exportfs nfs lockd nfs_acl
> auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
> [  168.478774] 
> [  168.478871] Pid: 4147, comm: rcu_torture_wri Tainted: G      D
> (2.6.28 #83) 
> [  168.479045] EIP: 0060:[<d1903233>] EFLAGS: 00010202 CPU: 0
> [  168.479228] EIP is at 0xd1903233
> [  168.479340] EAX: 00000000 EBX: d19073fc ECX: 00000001 EDX: 00000000
> [  168.479472] ESI: 0000000a EDI: 00000000 EBP: c7e05fd0 ESP: c7e05fc0
> [  168.479604]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  168.479787] Process rcu_torture_wri (pid: 4147, ti=c7e05000
> task=c994e710 task.ti=c7e05000)
> [  168.479968] Stack:
> [  168.480015]  00000000 00000000 00000000 d19031f0 c7e05fe0 c013d81a
> c013d7e0 00000000
> [  168.480015]  00000000 c0103cf3 c7ca9e70 00000000 00000000 00000000
> 7574726f 725f6572
> [  168.480015] Call Trace:
> [  168.480015]  [<c013d81a>] ? kthread+0x3a/0x70
> [  168.480015]  [<c013d7e0>] ? kthread+0x0/0x70
> [  168.480015]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
> [  168.480015] Code: 01 00 00 ba 13 00 00 00 be 0a 00 00 00 a1 00 20 83
> c0 e8 a1 21 82 ee e9 a0 00 00 00 8d 74 26 00 ff 05 6c 78 90 d1 8b 43 04
> 8b 13 <89> 42 04 89 10 b8 cc 59 90 d1 89 1b 89 5b 04 e8 19 a3 ca ee 83 
> [  168.480015] EIP: [<d1903233>] 0xd1903233 SS:ESP 0068:c7e05fc0
> [  168.486680] ---[ end trace 4bbd74c2c853213b ]---
> [  168.486798] Kernel panic - not syncing: Fatal exception in interrupt
> 9 90 d1 85 db
> 74 12 <8b> 48 28 85 c9 74 0b 8b 55 bc 85 d2 0f 84 19 01 00 00 8d b4 26 
> [  168.448022] EIP: [<d1902ebf>] 0xd1902ebf SS:ESP 0068:c7b48f84
> [  168.456684] ---[ end trace 4bbd74c2c853213b ]---
> 
> 
> Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-05 18:00                         ` Paul E. McKenney
@ 2009-01-05 18:56                           ` Eric Sesterhenn
  2009-01-05 19:36                             ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-05 18:56 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Mon, Jan 05, 2009 at 01:14:09PM +0100, Eric Sesterhenn wrote:
> > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > 
> > Could the popular rcu function be registered by rcutorture, but when
> > we remove the module the callback is no longer valid? I can compile
> > a kernel just fine and with other stress tests i did not see any oops so
> > far.
> 
> One approach would be to print out the address of rcutorture's RCU
> callbacks at rcutorture module initialization time (in rcu_torture_init()
> in kernel/rcutorture.c).  The two callbacks are rcu_torture_cb() and
> rcu_bh_torture_wakeme_after_cb().  Unless you are specifying the 
> "torture_type" parameter to rcutorture, only the first one should be in
> use.

with a printk(KERN_ERR "rcu_torture_cb: %p rcu_bh_torture_wakeme_after_cb:
%p\n", rcu_torture_cb, rcu_bh_torture_wakeme_after_cb);



[   65.135468] rcu_torture_cb: d0af7d1b rcu_bh_torture_wakeme_after_cb:
d0af7bec
[   65.135672] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
irqreader=1
[   71.171603] BUG: unable to handle kernel NULL pointer dereference at
(null)
[   71.171954] IP: [<d0af7a0f>] 0xd0af7a0f
[   71.192822] *pde = 00000000 
[   71.196513] Oops: 0002 [#1] PREEMPT DEBUG_PAGEALLOC
[   71.196826] last sysfs file: /sys/block/ram9/range
[   71.197010] Modules linked in: [last unloaded: rcutorture]
[   71.197010] 
[   71.197010] Pid: 4861, comm: rcu_torture_wri Tainted: G        W
(2.6.28-05716-gfe0bdec-dirty #171) System Name
[   71.197010] EIP: 0060:[<d0af7a0f>] EFLAGS: 00010282 CPU: 0
[   71.197010] EIP is at 0xd0af7a0f
[   71.197010] EAX: 00000000 EBX: d0afbc20 ECX: c04f5cef EDX: c98abf7c
[   71.197010] ESI: d0af7df0 EDI: 00000000 EBP: c98abfc4 ESP: c98abfc4
[   71.197010]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[   71.197010] Process rcu_torture_wri (pid: 4861, ti=c98ab000
task=c9890d00 task.ti=c98ab000)
[   71.197010] Stack:
[   71.197010]  c98abfd0 d0af7eeb 00000000 c98abfe0 c0137364 c0137326
00000000 00000000
[   71.197010]  c0103643 c981fea4 00000000 00000000 00000000 00000000
00000000
[   71.197010] Call Trace:
[   71.197010]  [<c0137364>] ? kthread+0x3e/0x66
[   71.197010]  [<c0137326>] ? kthread+0x0/0x66
[   71.197010]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
[   71.197010] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[   71.197010] EIP: [<d0af7a0f>] 0xd0af7a0f SS:ESP 0068:c98abfc4
[   71.301103] ---[ end trace 4eaa2a86a8e2da22 ]---

If i interpret this correctly, this corresponds to

000009e8 <rcu_stutter_wait>:
     9e8:       55                      push   %ebp
     9e9:       89 e5                   mov    %esp,%ebp
     9eb:       e8 fc ff ff ff          call   9ec <rcu_stutter_wait+0x4>
     9f0:       eb 1d                   jmp    a0f <rcu_stutter_wait+0x27>
     9f2:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
     9f9:       b8 01 00 00 00          mov    $0x1,%eax
     9fe:       75 0a                   jne    a0a <rcu_stutter_wait+0x22>
     a00:       b8 e8 03 00 00          mov    $0x3e8,%eax
     a05:       e8 fc ff ff ff          call   a06 <rcu_stutter_wait+0x1e>
     a0a:       e8 fc ff ff ff          call   a0b <rcu_stutter_wait+0x23>
     a0f:       83 3d 6c 00 00 00 00    cmpl   $0x0,0x6c
			^---------- this line
     a16:       75 09                   jne    a21 <rcu_stutter_wait+0x39>
     a18:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
     a1f:       75 09                   jne    a2a <rcu_stutter_wait+0x42>
     a21:       83 3d 50 1a 00 00 00    cmpl   $0x0,0x1a50
     a28:       74 c8                   je     9f2 <rcu_stutter_wait+0xa>
     a2a:       5d                      pop    %ebp
     a2b:       c3                      ret

Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-05 18:56                           ` Eric Sesterhenn
@ 2009-01-05 19:36                             ` Paul E. McKenney
  2009-01-05 20:01                               ` Eric Sesterhenn
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-05 19:36 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Mon, Jan 05, 2009 at 07:56:55PM +0100, Eric Sesterhenn wrote:
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Mon, Jan 05, 2009 at 01:14:09PM +0100, Eric Sesterhenn wrote:
> > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > 
> > > Could the popular rcu function be registered by rcutorture, but when
> > > we remove the module the callback is no longer valid? I can compile
> > > a kernel just fine and with other stress tests i did not see any oops so
> > > far.
> > 
> > One approach would be to print out the address of rcutorture's RCU
> > callbacks at rcutorture module initialization time (in rcu_torture_init()
> > in kernel/rcutorture.c).  The two callbacks are rcu_torture_cb() and
> > rcu_bh_torture_wakeme_after_cb().  Unless you are specifying the 
> > "torture_type" parameter to rcutorture, only the first one should be in
> > use.
> 
> with a printk(KERN_ERR "rcu_torture_cb: %p rcu_bh_torture_wakeme_after_cb:
> %p\n", rcu_torture_cb, rcu_bh_torture_wakeme_after_cb);

Cool!

> [   65.135468] rcu_torture_cb: d0af7d1b rcu_bh_torture_wakeme_after_cb:
> d0af7bec
> [   65.135672] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
> stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
> irqreader=1
> [   71.171603] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [   71.171954] IP: [<d0af7a0f>] 0xd0af7a0f
> [   71.192822] *pde = 00000000 
> [   71.196513] Oops: 0002 [#1] PREEMPT DEBUG_PAGEALLOC
> [   71.196826] last sysfs file: /sys/block/ram9/range
> [   71.197010] Modules linked in: [last unloaded: rcutorture]
> [   71.197010] 
> [   71.197010] Pid: 4861, comm: rcu_torture_wri Tainted: G        W
> (2.6.28-05716-gfe0bdec-dirty #171) System Name
> [   71.197010] EIP: 0060:[<d0af7a0f>] EFLAGS: 00010282 CPU: 0
> [   71.197010] EIP is at 0xd0af7a0f
> [   71.197010] EAX: 00000000 EBX: d0afbc20 ECX: c04f5cef EDX: c98abf7c
> [   71.197010] ESI: d0af7df0 EDI: 00000000 EBP: c98abfc4 ESP: c98abfc4
> [   71.197010]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [   71.197010] Process rcu_torture_wri (pid: 4861, ti=c98ab000
> task=c9890d00 task.ti=c98ab000)
> [   71.197010] Stack:
> [   71.197010]  c98abfd0 d0af7eeb 00000000 c98abfe0 c0137364 c0137326
> 00000000 00000000
> [   71.197010]  c0103643 c981fea4 00000000 00000000 00000000 00000000
> 00000000
> [   71.197010] Call Trace:
> [   71.197010]  [<c0137364>] ? kthread+0x3e/0x66
> [   71.197010]  [<c0137326>] ? kthread+0x0/0x66
> [   71.197010]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
> [   71.197010] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> [   71.197010] EIP: [<d0af7a0f>] 0xd0af7a0f SS:ESP 0068:c98abfc4
> [   71.301103] ---[ end trace 4eaa2a86a8e2da22 ]---
> 
> If i interpret this correctly, this corresponds to
> 
> 000009e8 <rcu_stutter_wait>:
>      9e8:       55                      push   %ebp
>      9e9:       89 e5                   mov    %esp,%ebp
>      9eb:       e8 fc ff ff ff          call   9ec <rcu_stutter_wait+0x4>

Wow!!!  Am I reading this correctly?  Does the above "call" instruction
-really- call one byte into itself?  That is what the hex for the x86
instruction -looks- like it is doing, but I cannot see what would have
possessed the compiler to generate this code.

When I compile on a 32-bit x86 machine, I don't see the above "call"
instruction.  Other than that, the code I see looks consistent.

>      9f0:       eb 1d                   jmp    a0f <rcu_stutter_wait+0x27>
>      9f2:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
>      9f9:       b8 01 00 00 00          mov    $0x1,%eax
>      9fe:       75 0a                   jne    a0a <rcu_stutter_wait+0x22>
>      a00:       b8 e8 03 00 00          mov    $0x3e8,%eax
>      a05:       e8 fc ff ff ff          call   a06 <rcu_stutter_wait+0x1e>
>      a0a:       e8 fc ff ff ff          call   a0b <rcu_stutter_wait+0x23>
>      a0f:       83 3d 6c 00 00 00 00    cmpl   $0x0,0x6c
> 			^---------- this line

This looks like the first test in the "while" loop.

>      a16:       75 09                   jne    a21 <rcu_stutter_wait+0x39>
>      a18:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
>      a1f:       75 09                   jne    a2a <rcu_stutter_wait+0x42>
>      a21:       83 3d 50 1a 00 00 00    cmpl   $0x0,0x1a50
>      a28:       74 c8                   je     9f2 <rcu_stutter_wait+0xa>
>      a2a:       5d                      pop    %ebp
>      a2b:       c3                      ret

The corresponding C code is as follows:

static void
rcu_stutter_wait(void)
{
	while ((stutter_pause_test || !rcutorture_runnable) && !fullstop) {
		if (rcutorture_runnable)
			schedule_timeout_interruptible(1);
		else
			schedule_timeout_interruptible(round_jiffies_relative(HZ));
	}
}

I don't see much opportunity for a page fault here...  This is the
binary I get when I compile it, though not as a module:

0000085a <rcu_stutter_wait>:
     85a:	55                   	push   %ebp
     85b:	89 e5                	mov    %esp,%ebp
     85d:	eb 1d                	jmp    87c <rcu_stutter_wait+0x22>
     85f:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
     866:	b8 01 00 00 00       	mov    $0x1,%eax
     86b:	75 0a                	jne    877 <rcu_stutter_wait+0x1d>
     86d:	b8 e8 03 00 00       	mov    $0x3e8,%eax
     872:	e8 fc ff ff ff       	call   873 <rcu_stutter_wait+0x19>
     877:	e8 fc ff ff ff       	call   878 <rcu_stutter_wait+0x1e>
     87c:	83 3d 14 00 00 00 00 	cmpl   $0x0,0x14
     883:	75 09                	jne    88e <rcu_stutter_wait+0x34>
     885:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
     88c:	75 09                	jne    897 <rcu_stutter_wait+0x3d>
     88e:	83 3d 08 1a 00 00 00 	cmpl   $0x0,0x1a08
     895:	74 c8                	je     85f <rcu_stutter_wait+0x5>
     897:	5d                   	pop    %ebp
     898:	c3                   	ret    

I confess, I am confused!!!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-05 19:36                             ` Paul E. McKenney
@ 2009-01-05 20:01                               ` Eric Sesterhenn
  2009-01-05 20:16                                 ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-05 20:01 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

hi,

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Mon, Jan 05, 2009 at 07:56:55PM +0100, Eric Sesterhenn wrote:
> > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > [   65.135468] rcu_torture_cb: d0af7d1b rcu_bh_torture_wakeme_after_cb:
> > d0af7bec
> > [   65.135672] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
> > stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
> > irqreader=1
> > [   71.171603] BUG: unable to handle kernel NULL pointer dereference at
> > (null)
> > [   71.171954] IP: [<d0af7a0f>] 0xd0af7a0f
> > [   71.192822] *pde = 00000000 
> > [   71.196513] Oops: 0002 [#1] PREEMPT DEBUG_PAGEALLOC
> > [   71.196826] last sysfs file: /sys/block/ram9/range
> > [   71.197010] Modules linked in: [last unloaded: rcutorture]
> > [   71.197010] 
> > [   71.197010] Pid: 4861, comm: rcu_torture_wri Tainted: G        W
> > (2.6.28-05716-gfe0bdec-dirty #171) System Name
> > [   71.197010] EIP: 0060:[<d0af7a0f>] EFLAGS: 00010282 CPU: 0
> > [   71.197010] EIP is at 0xd0af7a0f
> > [   71.197010] EAX: 00000000 EBX: d0afbc20 ECX: c04f5cef EDX: c98abf7c
> > [   71.197010] ESI: d0af7df0 EDI: 00000000 EBP: c98abfc4 ESP: c98abfc4
> > [   71.197010]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > [   71.197010] Process rcu_torture_wri (pid: 4861, ti=c98ab000
> > task=c9890d00 task.ti=c98ab000)
> > [   71.197010] Stack:
> > [   71.197010]  c98abfd0 d0af7eeb 00000000 c98abfe0 c0137364 c0137326
> > 00000000 00000000
> > [   71.197010]  c0103643 c981fea4 00000000 00000000 00000000 00000000
> > 00000000
> > [   71.197010] Call Trace:
> > [   71.197010]  [<c0137364>] ? kthread+0x3e/0x66
> > [   71.197010]  [<c0137326>] ? kthread+0x0/0x66
> > [   71.197010]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
> > [   71.197010] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> > [   71.197010] EIP: [<d0af7a0f>] 0xd0af7a0f SS:ESP 0068:c98abfc4
> > [   71.301103] ---[ end trace 4eaa2a86a8e2da22 ]---
> > 
> > If i interpret this correctly, this corresponds to
> > 
> > 000009e8 <rcu_stutter_wait>:
> >      9e8:       55                      push   %ebp
> >      9e9:       89 e5                   mov    %esp,%ebp
> >      9eb:       e8 fc ff ff ff          call   9ec <rcu_stutter_wait+0x4>
> 
> Wow!!!  Am I reading this correctly?  Does the above "call" instruction
> -really- call one byte into itself?  That is what the hex for the x86
> instruction -looks- like it is doing, but I cannot see what would have
> possessed the compiler to generate this code.

Compiler is gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)

 
> When I compile on a 32-bit x86 machine, I don't see the above "call"
> instruction.  Other than that, the code I see looks consistent.
> 
> >      9f0:       eb 1d                   jmp    a0f <rcu_stutter_wait+0x27>
> >      9f2:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> >      9f9:       b8 01 00 00 00          mov    $0x1,%eax
> >      9fe:       75 0a                   jne    a0a <rcu_stutter_wait+0x22>
> >      a00:       b8 e8 03 00 00          mov    $0x3e8,%eax
> >      a05:       e8 fc ff ff ff          call   a06 <rcu_stutter_wait+0x1e>
> >      a0a:       e8 fc ff ff ff          call   a0b <rcu_stutter_wait+0x23>
> >      a0f:       83 3d 6c 00 00 00 00    cmpl   $0x0,0x6c
> > 			^---------- this line
> 
> This looks like the first test in the "while" loop.
> 
> >      a16:       75 09                   jne    a21 <rcu_stutter_wait+0x39>
> >      a18:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> >      a1f:       75 09                   jne    a2a <rcu_stutter_wait+0x42>
> >      a21:       83 3d 50 1a 00 00 00    cmpl   $0x0,0x1a50
> >      a28:       74 c8                   je     9f2 <rcu_stutter_wait+0xa>
> >      a2a:       5d                      pop    %ebp
> >      a2b:       c3                      ret
> 
> The corresponding C code is as follows:
> 
> static void
> rcu_stutter_wait(void)
> {
> 	while ((stutter_pause_test || !rcutorture_runnable) && !fullstop) {
> 		if (rcutorture_runnable)
> 			schedule_timeout_interruptible(1);
> 		else
> 			schedule_timeout_interruptible(round_jiffies_relative(HZ));
> 	}
> }
> 
> I don't see much opportunity for a page fault here...  This is the
> binary I get when I compile it, though not as a module:
> 
> 0000085a <rcu_stutter_wait>:
>      85a:	55                   	push   %ebp
>      85b:	89 e5                	mov    %esp,%ebp
>      85d:	eb 1d                	jmp    87c <rcu_stutter_wait+0x22>
>      85f:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
>      866:	b8 01 00 00 00       	mov    $0x1,%eax
>      86b:	75 0a                	jne    877 <rcu_stutter_wait+0x1d>
>      86d:	b8 e8 03 00 00       	mov    $0x3e8,%eax
>      872:	e8 fc ff ff ff       	call   873 <rcu_stutter_wait+0x19>
>      877:	e8 fc ff ff ff       	call   878 <rcu_stutter_wait+0x1e>
>      87c:	83 3d 14 00 00 00 00 	cmpl   $0x0,0x14
>      883:	75 09                	jne    88e <rcu_stutter_wait+0x34>
>      885:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
>      88c:	75 09                	jne    897 <rcu_stutter_wait+0x3d>
>      88e:	83 3d 08 1a 00 00 00 	cmpl   $0x0,0x1a08
>      895:	74 c8                	je     85f <rcu_stutter_wait+0x5>
>      897:	5d                   	pop    %ebp
>      898:	c3                   	ret    
> 
> I confess, I am confused!!!

on the other box with a different gcc version

gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu11) 

d1902e90 is the start of rcu_stutter_wait

[  533.391719] d087e000 d1902e90
[  533.392294] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
[  541.000139] BUG: unable to handle kernel paging request at d1902efd
[  541.000423] IP: [<d1902efd>] 0xd1902efd
[  541.000660] *pde = 0f08f067 *pte = 00000000 
[  541.000867] Oops: 0000 [#1] DEBUG_PAGEALLOC
[  541.001126] last sysfs file: /sys/block/sda/size
[  541.001246] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
[  541.002235] 
[  541.002334] Pid: 5292, comm: rcu_torture_wri Not tainted (2.6.28 #84) 
[  541.002470] EIP: 0060:[<d1902efd>] EFLAGS: 00010296 CPU: 0
[  541.002598] EIP is at 0xd1902efd
[  541.002767] EAX: 00000000 EBX: d19073c0 ECX: 00000000 EDX: 00000000
[  541.002900] ESI: 0000000a EDI: 00000000 EBP: c7b63fb8 ESP: c7b63fb8
[  541.003033]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[  541.003160] Process rcu_torture_wri (pid: 5292, ti=c7b63000 task=c7b09710 task.ti=c7b63000)
[  541.003400] Stack:
[  541.003497]  c7b63fd0 d19032c1 00000000 00000000 00000000 d1903200 c7b63fe0 c013d80a
[  541.004022]  c013d7d0 00000000 00000000 c0103cf3 cef6ee70 00000000 00000000 00000000
[  541.004022]  00000201 000004b4
[  541.004022] Call Trace:
[  541.004022]  [<c013d80a>] ? kthread+0x3a/0x70
[  541.004022]  [<c013d7d0>] ? kthread+0x0/0x70
[  541.004022]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
[  541.004022] Code:  Bad EIP value.
[  541.004022] EIP: [<d1902efd>] 0xd1902efd SS:ESP 0068:c7b63fb8
[  541.004022] ---[ end trace cb3b10c2bb94b4e3 ]---


00000e90 <rcu_stutter_wait>:
     e90:	55                   	push   %ebp
     e91:	89 e5                	mov    %esp,%ebp
     e93:	90                   	nop    
     e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
     e98:	a1 98 00 00 00       	mov    0x98,%eax
     e9d:	85 c0                	test   %eax,%eax
     e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
     ea1:	a1 00 00 00 00       	mov    0x0,%eax
     ea6:	85 c0                	test   %eax,%eax
     ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
     eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
     eaf:	85 c0                	test   %eax,%eax
     eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
     eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
     eb9:	85 d2                	test   %edx,%edx
     ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
     ebd:	b8 01 00 00 00       	mov    $0x1,%eax
     ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
     ec7:	a1 98 00 00 00       	mov    0x98,%eax
     ecc:	85 c0                	test   %eax,%eax
     ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
     ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
     ed5:	85 c0                	test   %eax,%eax
     ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
     ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
     ee0:	5d                   	pop    %ebp
     ee1:	c3                   	ret    
     ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
     ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
     eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
     ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
     ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
     efd:	8d 76 00             	lea    0x0(%esi),%esi
			   ^------------- here

This one looks more like it can explain a page fault

     f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
     f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
     f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi

Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-05 20:01                               ` Eric Sesterhenn
@ 2009-01-05 20:16                                 ` Paul E. McKenney
  2009-01-05 20:31                                   ` Eric Sesterhenn
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-05 20:16 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Mon, Jan 05, 2009 at 09:01:45PM +0100, Eric Sesterhenn wrote:
> hi,
> 
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Mon, Jan 05, 2009 at 07:56:55PM +0100, Eric Sesterhenn wrote:
> > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > [   65.135468] rcu_torture_cb: d0af7d1b rcu_bh_torture_wakeme_after_cb:
> > > d0af7bec
> > > [   65.135672] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4
> > > stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5
> > > irqreader=1
> > > [   71.171603] BUG: unable to handle kernel NULL pointer dereference at
> > > (null)
> > > [   71.171954] IP: [<d0af7a0f>] 0xd0af7a0f
> > > [   71.192822] *pde = 00000000 
> > > [   71.196513] Oops: 0002 [#1] PREEMPT DEBUG_PAGEALLOC
> > > [   71.196826] last sysfs file: /sys/block/ram9/range
> > > [   71.197010] Modules linked in: [last unloaded: rcutorture]
> > > [   71.197010] 
> > > [   71.197010] Pid: 4861, comm: rcu_torture_wri Tainted: G        W
> > > (2.6.28-05716-gfe0bdec-dirty #171) System Name
> > > [   71.197010] EIP: 0060:[<d0af7a0f>] EFLAGS: 00010282 CPU: 0
> > > [   71.197010] EIP is at 0xd0af7a0f
> > > [   71.197010] EAX: 00000000 EBX: d0afbc20 ECX: c04f5cef EDX: c98abf7c
> > > [   71.197010] ESI: d0af7df0 EDI: 00000000 EBP: c98abfc4 ESP: c98abfc4
> > > [   71.197010]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > > [   71.197010] Process rcu_torture_wri (pid: 4861, ti=c98ab000
> > > task=c9890d00 task.ti=c98ab000)
> > > [   71.197010] Stack:
> > > [   71.197010]  c98abfd0 d0af7eeb 00000000 c98abfe0 c0137364 c0137326
> > > 00000000 00000000
> > > [   71.197010]  c0103643 c981fea4 00000000 00000000 00000000 00000000
> > > 00000000
> > > [   71.197010] Call Trace:
> > > [   71.197010]  [<c0137364>] ? kthread+0x3e/0x66
> > > [   71.197010]  [<c0137326>] ? kthread+0x0/0x66
> > > [   71.197010]  [<c0103643>] ? kernel_thread_helper+0x7/0x10
> > > [   71.197010] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> > > [   71.197010] EIP: [<d0af7a0f>] 0xd0af7a0f SS:ESP 0068:c98abfc4
> > > [   71.301103] ---[ end trace 4eaa2a86a8e2da22 ]---
> > > 
> > > If i interpret this correctly, this corresponds to
> > > 
> > > 000009e8 <rcu_stutter_wait>:
> > >      9e8:       55                      push   %ebp
> > >      9e9:       89 e5                   mov    %esp,%ebp
> > >      9eb:       e8 fc ff ff ff          call   9ec <rcu_stutter_wait+0x4>
> > 
> > Wow!!!  Am I reading this correctly?  Does the above "call" instruction
> > -really- call one byte into itself?  That is what the hex for the x86
> > instruction -looks- like it is doing, but I cannot see what would have
> > possessed the compiler to generate this code.
> 
> Compiler is gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)

I am using 4.1.3, for whatever it is worth.  (Ancient, I know!)

> > When I compile on a 32-bit x86 machine, I don't see the above "call"
> > instruction.  Other than that, the code I see looks consistent.
> > 
> > >      9f0:       eb 1d                   jmp    a0f <rcu_stutter_wait+0x27>
> > >      9f2:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > >      9f9:       b8 01 00 00 00          mov    $0x1,%eax
> > >      9fe:       75 0a                   jne    a0a <rcu_stutter_wait+0x22>
> > >      a00:       b8 e8 03 00 00          mov    $0x3e8,%eax
> > >      a05:       e8 fc ff ff ff          call   a06 <rcu_stutter_wait+0x1e>
> > >      a0a:       e8 fc ff ff ff          call   a0b <rcu_stutter_wait+0x23>
> > >      a0f:       83 3d 6c 00 00 00 00    cmpl   $0x0,0x6c
> > > 			^---------- this line
> > 
> > This looks like the first test in the "while" loop.
> > 
> > >      a16:       75 09                   jne    a21 <rcu_stutter_wait+0x39>
> > >      a18:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > >      a1f:       75 09                   jne    a2a <rcu_stutter_wait+0x42>
> > >      a21:       83 3d 50 1a 00 00 00    cmpl   $0x0,0x1a50
> > >      a28:       74 c8                   je     9f2 <rcu_stutter_wait+0xa>
> > >      a2a:       5d                      pop    %ebp
> > >      a2b:       c3                      ret
> > 
> > The corresponding C code is as follows:
> > 
> > static void
> > rcu_stutter_wait(void)
> > {
> > 	while ((stutter_pause_test || !rcutorture_runnable) && !fullstop) {
> > 		if (rcutorture_runnable)
> > 			schedule_timeout_interruptible(1);
> > 		else
> > 			schedule_timeout_interruptible(round_jiffies_relative(HZ));
> > 	}
> > }
> > 
> > I don't see much opportunity for a page fault here...  This is the
> > binary I get when I compile it, though not as a module:
> > 
> > 0000085a <rcu_stutter_wait>:
> >      85a:	55                   	push   %ebp
> >      85b:	89 e5                	mov    %esp,%ebp
> >      85d:	eb 1d                	jmp    87c <rcu_stutter_wait+0x22>
> >      85f:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> >      866:	b8 01 00 00 00       	mov    $0x1,%eax
> >      86b:	75 0a                	jne    877 <rcu_stutter_wait+0x1d>
> >      86d:	b8 e8 03 00 00       	mov    $0x3e8,%eax
> >      872:	e8 fc ff ff ff       	call   873 <rcu_stutter_wait+0x19>
> >      877:	e8 fc ff ff ff       	call   878 <rcu_stutter_wait+0x1e>
> >      87c:	83 3d 14 00 00 00 00 	cmpl   $0x0,0x14
> >      883:	75 09                	jne    88e <rcu_stutter_wait+0x34>
> >      885:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> >      88c:	75 09                	jne    897 <rcu_stutter_wait+0x3d>
> >      88e:	83 3d 08 1a 00 00 00 	cmpl   $0x0,0x1a08
> >      895:	74 c8                	je     85f <rcu_stutter_wait+0x5>
> >      897:	5d                   	pop    %ebp
> >      898:	c3                   	ret    
> > 
> > I confess, I am confused!!!
> 
> on the other box with a different gcc version
> 
> gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu11) 
> 
> d1902e90 is the start of rcu_stutter_wait
> 
> [  533.391719] d087e000 d1902e90
> [  533.392294] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
> [  541.000139] BUG: unable to handle kernel paging request at d1902efd
> [  541.000423] IP: [<d1902efd>] 0xd1902efd
> [  541.000660] *pde = 0f08f067 *pte = 00000000 
> [  541.000867] Oops: 0000 [#1] DEBUG_PAGEALLOC
> [  541.001126] last sysfs file: /sys/block/sda/size
> [  541.001246] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
> [  541.002235] 
> [  541.002334] Pid: 5292, comm: rcu_torture_wri Not tainted (2.6.28 #84) 
> [  541.002470] EIP: 0060:[<d1902efd>] EFLAGS: 00010296 CPU: 0
> [  541.002598] EIP is at 0xd1902efd
> [  541.002767] EAX: 00000000 EBX: d19073c0 ECX: 00000000 EDX: 00000000
> [  541.002900] ESI: 0000000a EDI: 00000000 EBP: c7b63fb8 ESP: c7b63fb8
> [  541.003033]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  541.003160] Process rcu_torture_wri (pid: 5292, ti=c7b63000 task=c7b09710 task.ti=c7b63000)
> [  541.003400] Stack:
> [  541.003497]  c7b63fd0 d19032c1 00000000 00000000 00000000 d1903200 c7b63fe0 c013d80a
> [  541.004022]  c013d7d0 00000000 00000000 c0103cf3 cef6ee70 00000000 00000000 00000000
> [  541.004022]  00000201 000004b4
> [  541.004022] Call Trace:
> [  541.004022]  [<c013d80a>] ? kthread+0x3a/0x70
> [  541.004022]  [<c013d7d0>] ? kthread+0x0/0x70
> [  541.004022]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
> [  541.004022] Code:  Bad EIP value.
> [  541.004022] EIP: [<d1902efd>] 0xd1902efd SS:ESP 0068:c7b63fb8
> [  541.004022] ---[ end trace cb3b10c2bb94b4e3 ]---
> 
> 
> 00000e90 <rcu_stutter_wait>:
>      e90:	55                   	push   %ebp
>      e91:	89 e5                	mov    %esp,%ebp
>      e93:	90                   	nop    
>      e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
>      e98:	a1 98 00 00 00       	mov    0x98,%eax
>      e9d:	85 c0                	test   %eax,%eax
>      e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
>      ea1:	a1 00 00 00 00       	mov    0x0,%eax
>      ea6:	85 c0                	test   %eax,%eax
>      ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
>      eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
>      eaf:	85 c0                	test   %eax,%eax
>      eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
>      eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
>      eb9:	85 d2                	test   %edx,%edx
>      ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
>      ebd:	b8 01 00 00 00       	mov    $0x1,%eax
>      ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
>      ec7:	a1 98 00 00 00       	mov    0x98,%eax
>      ecc:	85 c0                	test   %eax,%eax
>      ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
>      ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
>      ed5:	85 c0                	test   %eax,%eax
>      ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
>      ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
>      ee0:	5d                   	pop    %ebp
>      ee1:	c3                   	ret    
>      ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
>      ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
>      eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>

Here we are again calling one byte into the current instruction!!!

Or am I misinterpreting this code?

>      ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
>      ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
>      efd:	8d 76 00             	lea    0x0(%esi),%esi
> 			   ^------------- here
> 
> This one looks more like it can explain a page fault

I don't understand why there are indirections in the assembly given the
C code for rcu_stutter_wait().

>      f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
>      f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
>      f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi
> 
> Greetings, Eric

							Thanx, Paul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-05 20:16                                 ` Paul E. McKenney
@ 2009-01-05 20:31                                   ` Eric Sesterhenn
  2009-01-05 22:18                                     ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-05 20:31 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Mon, Jan 05, 2009 at 09:01:45PM +0100, Eric Sesterhenn wrote:
> > hi,
> > 
> > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > On Mon, Jan 05, 2009 at 07:56:55PM +0100, Eric Sesterhenn wrote:
> > > 
> > > Wow!!!  Am I reading this correctly?  Does the above "call" instruction
> > > -really- call one byte into itself?  That is what the hex for the x86
> > > instruction -looks- like it is doing, but I cannot see what would have
> > > possessed the compiler to generate this code.
> > 
> > Compiler is gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)
> 
> I am using 4.1.3, for whatever it is worth.  (Ancient, I know!)
> 
> > > When I compile on a 32-bit x86 machine, I don't see the above "call"
> > > instruction.  Other than that, the code I see looks consistent.
> > > 
> > > >      9f0:       eb 1d                   jmp    a0f <rcu_stutter_wait+0x27>
> > > >      9f2:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > > >      9f9:       b8 01 00 00 00          mov    $0x1,%eax
> > > >      9fe:       75 0a                   jne    a0a <rcu_stutter_wait+0x22>
> > > >      a00:       b8 e8 03 00 00          mov    $0x3e8,%eax
> > > >      a05:       e8 fc ff ff ff          call   a06 <rcu_stutter_wait+0x1e>
> > > >      a0a:       e8 fc ff ff ff          call   a0b <rcu_stutter_wait+0x23>
> > > >      a0f:       83 3d 6c 00 00 00 00    cmpl   $0x0,0x6c
> > > > 			^---------- this line
> > > 
> > > This looks like the first test in the "while" loop.
> > > 
> > > >      a16:       75 09                   jne    a21 <rcu_stutter_wait+0x39>
> > > >      a18:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > > >      a1f:       75 09                   jne    a2a <rcu_stutter_wait+0x42>
> > > >      a21:       83 3d 50 1a 00 00 00    cmpl   $0x0,0x1a50
> > > >      a28:       74 c8                   je     9f2 <rcu_stutter_wait+0xa>
> > > >      a2a:       5d                      pop    %ebp
> > > >      a2b:       c3                      ret
> > > 
> > > The corresponding C code is as follows:
> > > 
> > > static void
> > > rcu_stutter_wait(void)
> > > {
> > > 	while ((stutter_pause_test || !rcutorture_runnable) && !fullstop) {
> > > 		if (rcutorture_runnable)
> > > 			schedule_timeout_interruptible(1);
> > > 		else
> > > 			schedule_timeout_interruptible(round_jiffies_relative(HZ));
> > > 	}
> > > }
> > > 
> > > I don't see much opportunity for a page fault here...  This is the
> > > binary I get when I compile it, though not as a module:
> > > 
> > > 0000085a <rcu_stutter_wait>:
> > >      85a:	55                   	push   %ebp
> > >      85b:	89 e5                	mov    %esp,%ebp
> > >      85d:	eb 1d                	jmp    87c <rcu_stutter_wait+0x22>
> > >      85f:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> > >      866:	b8 01 00 00 00       	mov    $0x1,%eax
> > >      86b:	75 0a                	jne    877 <rcu_stutter_wait+0x1d>
> > >      86d:	b8 e8 03 00 00       	mov    $0x3e8,%eax
> > >      872:	e8 fc ff ff ff       	call   873 <rcu_stutter_wait+0x19>
> > >      877:	e8 fc ff ff ff       	call   878 <rcu_stutter_wait+0x1e>
> > >      87c:	83 3d 14 00 00 00 00 	cmpl   $0x0,0x14
> > >      883:	75 09                	jne    88e <rcu_stutter_wait+0x34>
> > >      885:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> > >      88c:	75 09                	jne    897 <rcu_stutter_wait+0x3d>
> > >      88e:	83 3d 08 1a 00 00 00 	cmpl   $0x0,0x1a08
> > >      895:	74 c8                	je     85f <rcu_stutter_wait+0x5>
> > >      897:	5d                   	pop    %ebp
> > >      898:	c3                   	ret    
> > > 
> > > I confess, I am confused!!!
> > 
> > on the other box with a different gcc version
> > 
> > gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu11) 
> > 
> > d1902e90 is the start of rcu_stutter_wait
> > 
> > [  533.391719] d087e000 d1902e90
> > [  533.392294] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
> > [  541.000139] BUG: unable to handle kernel paging request at d1902efd
> > [  541.000423] IP: [<d1902efd>] 0xd1902efd
> > [  541.000660] *pde = 0f08f067 *pte = 00000000 
> > [  541.000867] Oops: 0000 [#1] DEBUG_PAGEALLOC
> > [  541.001126] last sysfs file: /sys/block/sda/size
> > [  541.001246] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
> > [  541.002235] 
> > [  541.002334] Pid: 5292, comm: rcu_torture_wri Not tainted (2.6.28 #84) 
> > [  541.002470] EIP: 0060:[<d1902efd>] EFLAGS: 00010296 CPU: 0
> > [  541.002598] EIP is at 0xd1902efd
> > [  541.002767] EAX: 00000000 EBX: d19073c0 ECX: 00000000 EDX: 00000000
> > [  541.002900] ESI: 0000000a EDI: 00000000 EBP: c7b63fb8 ESP: c7b63fb8
> > [  541.003033]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > [  541.003160] Process rcu_torture_wri (pid: 5292, ti=c7b63000 task=c7b09710 task.ti=c7b63000)
> > [  541.003400] Stack:
> > [  541.003497]  c7b63fd0 d19032c1 00000000 00000000 00000000 d1903200 c7b63fe0 c013d80a
> > [  541.004022]  c013d7d0 00000000 00000000 c0103cf3 cef6ee70 00000000 00000000 00000000
> > [  541.004022]  00000201 000004b4
> > [  541.004022] Call Trace:
> > [  541.004022]  [<c013d80a>] ? kthread+0x3a/0x70
> > [  541.004022]  [<c013d7d0>] ? kthread+0x0/0x70
> > [  541.004022]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
> > [  541.004022] Code:  Bad EIP value.
> > [  541.004022] EIP: [<d1902efd>] 0xd1902efd SS:ESP 0068:c7b63fb8
> > [  541.004022] ---[ end trace cb3b10c2bb94b4e3 ]---
> > 
> > 
> > 00000e90 <rcu_stutter_wait>:
> >      e90:	55                   	push   %ebp
> >      e91:	89 e5                	mov    %esp,%ebp
> >      e93:	90                   	nop    
> >      e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
> >      e98:	a1 98 00 00 00       	mov    0x98,%eax
> >      e9d:	85 c0                	test   %eax,%eax
> >      e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
> >      ea1:	a1 00 00 00 00       	mov    0x0,%eax
> >      ea6:	85 c0                	test   %eax,%eax
> >      ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
> >      eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
> >      eaf:	85 c0                	test   %eax,%eax
> >      eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
> >      eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
> >      eb9:	85 d2                	test   %edx,%edx
> >      ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
> >      ebd:	b8 01 00 00 00       	mov    $0x1,%eax
> >      ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
> >      ec7:	a1 98 00 00 00       	mov    0x98,%eax
> >      ecc:	85 c0                	test   %eax,%eax
> >      ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
> >      ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
> >      ed5:	85 c0                	test   %eax,%eax
> >      ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
> >      ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> >      ee0:	5d                   	pop    %ebp
> >      ee1:	c3                   	ret    
> >      ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> >      ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
> >      eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
> 
> Here we are again calling one byte into the current instruction!!!
> 
> Or am I misinterpreting this code?
> 
> >      ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> >      ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
> >      efd:	8d 76 00             	lea    0x0(%esi),%esi
> > 			   ^------------- here
> > 
> > This one looks more like it can explain a page fault
> 
> I don't understand why there are indirections in the assembly given the
> C code for rcu_stutter_wait().
> 
> >      f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
> >      f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> >      f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi

ok, after trying to find out if the ubuntu gccs are broken, i stumbled
upon this:
http://forum.soft32.com/linux/Strange-problem-disassembling-shared-lib-ftopict439936.html

Seems the difference is that you dont compile it as a module and the
jump is perfectly normal, it gets overwritten when the stuff is loaded
objdump -dr gives me

00000e90 <rcu_stutter_wait>:
     e90:	55                   	push   %ebp
     e91:	89 e5                	mov    %esp,%ebp
     e93:	90                   	nop    
     e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
     e98:	a1 98 00 00 00       	mov    0x98,%eax
			e99: R_386_32	.bss
     e9d:	85 c0                	test   %eax,%eax
     e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
     ea1:	a1 00 00 00 00       	mov    0x0,%eax
			ea2: R_386_32	rcutorture_runnable
     ea6:	85 c0                	test   %eax,%eax
     ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
     eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
			eab: R_386_32	.bss
     eaf:	85 c0                	test   %eax,%eax
     eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
     eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
			eb5: R_386_32	rcutorture_runnable
     eb9:	85 d2                	test   %edx,%edx
     ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
     ebd:	b8 01 00 00 00       	mov    $0x1,%eax
     ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
			ec3: R_386_PC32	schedule_timeout_interruptible
     ec7:	a1 98 00 00 00       	mov    0x98,%eax
			ec8: R_386_32	.bss
     ecc:	85 c0                	test   %eax,%eax
     ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
     ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
			ed1: R_386_32	.bss
     ed5:	85 c0                	test   %eax,%eax
     ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
     ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
     ee0:	5d                   	pop    %ebp
     ee1:	c3                   	ret    
     ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
     ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
     eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
			eee: R_386_PC32	round_jiffies_relative
     ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
     ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
			ef9: R_386_PC32	schedule_timeout_interruptible
     efd:	8d 76 00             	lea    0x0(%esi),%esi

here is the deref ------------------------^

     f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
     f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
     f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-05 20:31                                   ` Eric Sesterhenn
@ 2009-01-05 22:18                                     ` Paul E. McKenney
  2009-01-06  0:29                                       ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-05 22:18 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Mon, Jan 05, 2009 at 09:31:53PM +0100, Eric Sesterhenn wrote:
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Mon, Jan 05, 2009 at 09:01:45PM +0100, Eric Sesterhenn wrote:
> > > hi,
> > > 
> > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > On Mon, Jan 05, 2009 at 07:56:55PM +0100, Eric Sesterhenn wrote:
> > > > 
> > > > Wow!!!  Am I reading this correctly?  Does the above "call" instruction
> > > > -really- call one byte into itself?  That is what the hex for the x86
> > > > instruction -looks- like it is doing, but I cannot see what would have
> > > > possessed the compiler to generate this code.
> > > 
> > > Compiler is gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)
> > 
> > I am using 4.1.3, for whatever it is worth.  (Ancient, I know!)
> > 
> > > > When I compile on a 32-bit x86 machine, I don't see the above "call"
> > > > instruction.  Other than that, the code I see looks consistent.
> > > > 
> > > > >      9f0:       eb 1d                   jmp    a0f <rcu_stutter_wait+0x27>
> > > > >      9f2:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > > > >      9f9:       b8 01 00 00 00          mov    $0x1,%eax
> > > > >      9fe:       75 0a                   jne    a0a <rcu_stutter_wait+0x22>
> > > > >      a00:       b8 e8 03 00 00          mov    $0x3e8,%eax
> > > > >      a05:       e8 fc ff ff ff          call   a06 <rcu_stutter_wait+0x1e>
> > > > >      a0a:       e8 fc ff ff ff          call   a0b <rcu_stutter_wait+0x23>
> > > > >      a0f:       83 3d 6c 00 00 00 00    cmpl   $0x0,0x6c
> > > > > 			^---------- this line
> > > > 
> > > > This looks like the first test in the "while" loop.
> > > > 
> > > > >      a16:       75 09                   jne    a21 <rcu_stutter_wait+0x39>
> > > > >      a18:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > > > >      a1f:       75 09                   jne    a2a <rcu_stutter_wait+0x42>
> > > > >      a21:       83 3d 50 1a 00 00 00    cmpl   $0x0,0x1a50
> > > > >      a28:       74 c8                   je     9f2 <rcu_stutter_wait+0xa>
> > > > >      a2a:       5d                      pop    %ebp
> > > > >      a2b:       c3                      ret
> > > > 
> > > > The corresponding C code is as follows:
> > > > 
> > > > static void
> > > > rcu_stutter_wait(void)
> > > > {
> > > > 	while ((stutter_pause_test || !rcutorture_runnable) && !fullstop) {
> > > > 		if (rcutorture_runnable)
> > > > 			schedule_timeout_interruptible(1);
> > > > 		else
> > > > 			schedule_timeout_interruptible(round_jiffies_relative(HZ));
> > > > 	}
> > > > }
> > > > 
> > > > I don't see much opportunity for a page fault here...  This is the
> > > > binary I get when I compile it, though not as a module:
> > > > 
> > > > 0000085a <rcu_stutter_wait>:
> > > >      85a:	55                   	push   %ebp
> > > >      85b:	89 e5                	mov    %esp,%ebp
> > > >      85d:	eb 1d                	jmp    87c <rcu_stutter_wait+0x22>
> > > >      85f:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> > > >      866:	b8 01 00 00 00       	mov    $0x1,%eax
> > > >      86b:	75 0a                	jne    877 <rcu_stutter_wait+0x1d>
> > > >      86d:	b8 e8 03 00 00       	mov    $0x3e8,%eax
> > > >      872:	e8 fc ff ff ff       	call   873 <rcu_stutter_wait+0x19>
> > > >      877:	e8 fc ff ff ff       	call   878 <rcu_stutter_wait+0x1e>
> > > >      87c:	83 3d 14 00 00 00 00 	cmpl   $0x0,0x14
> > > >      883:	75 09                	jne    88e <rcu_stutter_wait+0x34>
> > > >      885:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> > > >      88c:	75 09                	jne    897 <rcu_stutter_wait+0x3d>
> > > >      88e:	83 3d 08 1a 00 00 00 	cmpl   $0x0,0x1a08
> > > >      895:	74 c8                	je     85f <rcu_stutter_wait+0x5>
> > > >      897:	5d                   	pop    %ebp
> > > >      898:	c3                   	ret    
> > > > 
> > > > I confess, I am confused!!!
> > > 
> > > on the other box with a different gcc version
> > > 
> > > gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu11) 
> > > 
> > > d1902e90 is the start of rcu_stutter_wait
> > > 
> > > [  533.391719] d087e000 d1902e90
> > > [  533.392294] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
> > > [  541.000139] BUG: unable to handle kernel paging request at d1902efd
> > > [  541.000423] IP: [<d1902efd>] 0xd1902efd
> > > [  541.000660] *pde = 0f08f067 *pte = 00000000 
> > > [  541.000867] Oops: 0000 [#1] DEBUG_PAGEALLOC
> > > [  541.001126] last sysfs file: /sys/block/sda/size
> > > [  541.001246] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
> > > [  541.002235] 
> > > [  541.002334] Pid: 5292, comm: rcu_torture_wri Not tainted (2.6.28 #84) 
> > > [  541.002470] EIP: 0060:[<d1902efd>] EFLAGS: 00010296 CPU: 0
> > > [  541.002598] EIP is at 0xd1902efd
> > > [  541.002767] EAX: 00000000 EBX: d19073c0 ECX: 00000000 EDX: 00000000
> > > [  541.002900] ESI: 0000000a EDI: 00000000 EBP: c7b63fb8 ESP: c7b63fb8
> > > [  541.003033]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > > [  541.003160] Process rcu_torture_wri (pid: 5292, ti=c7b63000 task=c7b09710 task.ti=c7b63000)
> > > [  541.003400] Stack:
> > > [  541.003497]  c7b63fd0 d19032c1 00000000 00000000 00000000 d1903200 c7b63fe0 c013d80a
> > > [  541.004022]  c013d7d0 00000000 00000000 c0103cf3 cef6ee70 00000000 00000000 00000000
> > > [  541.004022]  00000201 000004b4
> > > [  541.004022] Call Trace:
> > > [  541.004022]  [<c013d80a>] ? kthread+0x3a/0x70
> > > [  541.004022]  [<c013d7d0>] ? kthread+0x0/0x70
> > > [  541.004022]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
> > > [  541.004022] Code:  Bad EIP value.
> > > [  541.004022] EIP: [<d1902efd>] 0xd1902efd SS:ESP 0068:c7b63fb8
> > > [  541.004022] ---[ end trace cb3b10c2bb94b4e3 ]---
> > > 
> > > 
> > > 00000e90 <rcu_stutter_wait>:
> > >      e90:	55                   	push   %ebp
> > >      e91:	89 e5                	mov    %esp,%ebp
> > >      e93:	90                   	nop    
> > >      e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
> > >      e98:	a1 98 00 00 00       	mov    0x98,%eax
> > >      e9d:	85 c0                	test   %eax,%eax
> > >      e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
> > >      ea1:	a1 00 00 00 00       	mov    0x0,%eax
> > >      ea6:	85 c0                	test   %eax,%eax
> > >      ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
> > >      eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > >      eaf:	85 c0                	test   %eax,%eax
> > >      eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
> > >      eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
> > >      eb9:	85 d2                	test   %edx,%edx
> > >      ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
> > >      ebd:	b8 01 00 00 00       	mov    $0x1,%eax
> > >      ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
> > >      ec7:	a1 98 00 00 00       	mov    0x98,%eax
> > >      ecc:	85 c0                	test   %eax,%eax
> > >      ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
> > >      ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > >      ed5:	85 c0                	test   %eax,%eax
> > >      ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
> > >      ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > >      ee0:	5d                   	pop    %ebp
> > >      ee1:	c3                   	ret    
> > >      ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > >      ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
> > >      eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
> > 
> > Here we are again calling one byte into the current instruction!!!
> > 
> > Or am I misinterpreting this code?
> > 
> > >      ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > >      ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
> > >      efd:	8d 76 00             	lea    0x0(%esi),%esi
> > > 			   ^------------- here
> > > 
> > > This one looks more like it can explain a page fault
> > 
> > I don't understand why there are indirections in the assembly given the
> > C code for rcu_stutter_wait().
> > 
> > >      f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
> > >      f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > >      f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi
> 
> ok, after trying to find out if the ubuntu gccs are broken, i stumbled
> upon this:
> http://forum.soft32.com/linux/Strange-problem-disassembling-shared-lib-ftopict439936.html
> 
> Seems the difference is that you dont compile it as a module and the
> jump is perfectly normal, it gets overwritten when the stuff is loaded
> objdump -dr gives me
> 
> 00000e90 <rcu_stutter_wait>:
>      e90:	55                   	push   %ebp
>      e91:	89 e5                	mov    %esp,%ebp
>      e93:	90                   	nop    
>      e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
>      e98:	a1 98 00 00 00       	mov    0x98,%eax
> 			e99: R_386_32	.bss
>      e9d:	85 c0                	test   %eax,%eax
>      e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
>      ea1:	a1 00 00 00 00       	mov    0x0,%eax
> 			ea2: R_386_32	rcutorture_runnable
>      ea6:	85 c0                	test   %eax,%eax
>      ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
>      eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
> 			eab: R_386_32	.bss
>      eaf:	85 c0                	test   %eax,%eax
>      eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
>      eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
> 			eb5: R_386_32	rcutorture_runnable
>      eb9:	85 d2                	test   %edx,%edx
>      ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
>      ebd:	b8 01 00 00 00       	mov    $0x1,%eax
>      ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
> 			ec3: R_386_PC32	schedule_timeout_interruptible
>      ec7:	a1 98 00 00 00       	mov    0x98,%eax
> 			ec8: R_386_32	.bss
>      ecc:	85 c0                	test   %eax,%eax
>      ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
>      ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
> 			ed1: R_386_32	.bss
>      ed5:	85 c0                	test   %eax,%eax
>      ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
>      ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
>      ee0:	5d                   	pop    %ebp
>      ee1:	c3                   	ret    
>      ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
>      ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
>      eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
> 			eee: R_386_PC32	round_jiffies_relative
>      ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
>      ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
> 			ef9: R_386_PC32	schedule_timeout_interruptible
>      efd:	8d 76 00             	lea    0x0(%esi),%esi
> 
> here is the deref ------------------------^

Ah!!!  We are getting a page fault while cleaning up the stack frame?

Ouch!

>      f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
>      f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
>      f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi

							Thanx, Paul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-05 22:18                                     ` Paul E. McKenney
@ 2009-01-06  0:29                                       ` Paul E. McKenney
  2009-01-06  2:15                                         ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-06  0:29 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Mon, Jan 05, 2009 at 02:18:36PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 05, 2009 at 09:31:53PM +0100, Eric Sesterhenn wrote:
> > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > On Mon, Jan 05, 2009 at 09:01:45PM +0100, Eric Sesterhenn wrote:
> > > > hi,
> > > > 
> > > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > > On Mon, Jan 05, 2009 at 07:56:55PM +0100, Eric Sesterhenn wrote:
> > > > > 
> > > > > Wow!!!  Am I reading this correctly?  Does the above "call" instruction
> > > > > -really- call one byte into itself?  That is what the hex for the x86
> > > > > instruction -looks- like it is doing, but I cannot see what would have
> > > > > possessed the compiler to generate this code.
> > > > 
> > > > Compiler is gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)
> > > 
> > > I am using 4.1.3, for whatever it is worth.  (Ancient, I know!)
> > > 
> > > > > When I compile on a 32-bit x86 machine, I don't see the above "call"
> > > > > instruction.  Other than that, the code I see looks consistent.
> > > > > 
> > > > > >      9f0:       eb 1d                   jmp    a0f <rcu_stutter_wait+0x27>
> > > > > >      9f2:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > > > > >      9f9:       b8 01 00 00 00          mov    $0x1,%eax
> > > > > >      9fe:       75 0a                   jne    a0a <rcu_stutter_wait+0x22>
> > > > > >      a00:       b8 e8 03 00 00          mov    $0x3e8,%eax
> > > > > >      a05:       e8 fc ff ff ff          call   a06 <rcu_stutter_wait+0x1e>
> > > > > >      a0a:       e8 fc ff ff ff          call   a0b <rcu_stutter_wait+0x23>
> > > > > >      a0f:       83 3d 6c 00 00 00 00    cmpl   $0x0,0x6c
> > > > > > 			^---------- this line
> > > > > 
> > > > > This looks like the first test in the "while" loop.
> > > > > 
> > > > > >      a16:       75 09                   jne    a21 <rcu_stutter_wait+0x39>
> > > > > >      a18:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > > > > >      a1f:       75 09                   jne    a2a <rcu_stutter_wait+0x42>
> > > > > >      a21:       83 3d 50 1a 00 00 00    cmpl   $0x0,0x1a50
> > > > > >      a28:       74 c8                   je     9f2 <rcu_stutter_wait+0xa>
> > > > > >      a2a:       5d                      pop    %ebp
> > > > > >      a2b:       c3                      ret
> > > > > 
> > > > > The corresponding C code is as follows:
> > > > > 
> > > > > static void
> > > > > rcu_stutter_wait(void)
> > > > > {
> > > > > 	while ((stutter_pause_test || !rcutorture_runnable) && !fullstop) {
> > > > > 		if (rcutorture_runnable)
> > > > > 			schedule_timeout_interruptible(1);
> > > > > 		else
> > > > > 			schedule_timeout_interruptible(round_jiffies_relative(HZ));
> > > > > 	}
> > > > > }
> > > > > 
> > > > > I don't see much opportunity for a page fault here...  This is the
> > > > > binary I get when I compile it, though not as a module:
> > > > > 
> > > > > 0000085a <rcu_stutter_wait>:
> > > > >      85a:	55                   	push   %ebp
> > > > >      85b:	89 e5                	mov    %esp,%ebp
> > > > >      85d:	eb 1d                	jmp    87c <rcu_stutter_wait+0x22>
> > > > >      85f:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> > > > >      866:	b8 01 00 00 00       	mov    $0x1,%eax
> > > > >      86b:	75 0a                	jne    877 <rcu_stutter_wait+0x1d>
> > > > >      86d:	b8 e8 03 00 00       	mov    $0x3e8,%eax
> > > > >      872:	e8 fc ff ff ff       	call   873 <rcu_stutter_wait+0x19>
> > > > >      877:	e8 fc ff ff ff       	call   878 <rcu_stutter_wait+0x1e>
> > > > >      87c:	83 3d 14 00 00 00 00 	cmpl   $0x0,0x14
> > > > >      883:	75 09                	jne    88e <rcu_stutter_wait+0x34>
> > > > >      885:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> > > > >      88c:	75 09                	jne    897 <rcu_stutter_wait+0x3d>
> > > > >      88e:	83 3d 08 1a 00 00 00 	cmpl   $0x0,0x1a08
> > > > >      895:	74 c8                	je     85f <rcu_stutter_wait+0x5>
> > > > >      897:	5d                   	pop    %ebp
> > > > >      898:	c3                   	ret    
> > > > > 
> > > > > I confess, I am confused!!!
> > > > 
> > > > on the other box with a different gcc version
> > > > 
> > > > gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu11) 
> > > > 
> > > > d1902e90 is the start of rcu_stutter_wait
> > > > 
> > > > [  533.391719] d087e000 d1902e90
> > > > [  533.392294] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
> > > > [  541.000139] BUG: unable to handle kernel paging request at d1902efd
> > > > [  541.000423] IP: [<d1902efd>] 0xd1902efd
> > > > [  541.000660] *pde = 0f08f067 *pte = 00000000 
> > > > [  541.000867] Oops: 0000 [#1] DEBUG_PAGEALLOC
> > > > [  541.001126] last sysfs file: /sys/block/sda/size
> > > > [  541.001246] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
> > > > [  541.002235] 
> > > > [  541.002334] Pid: 5292, comm: rcu_torture_wri Not tainted (2.6.28 #84) 
> > > > [  541.002470] EIP: 0060:[<d1902efd>] EFLAGS: 00010296 CPU: 0
> > > > [  541.002598] EIP is at 0xd1902efd
> > > > [  541.002767] EAX: 00000000 EBX: d19073c0 ECX: 00000000 EDX: 00000000
> > > > [  541.002900] ESI: 0000000a EDI: 00000000 EBP: c7b63fb8 ESP: c7b63fb8
> > > > [  541.003033]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > > > [  541.003160] Process rcu_torture_wri (pid: 5292, ti=c7b63000 task=c7b09710 task.ti=c7b63000)
> > > > [  541.003400] Stack:
> > > > [  541.003497]  c7b63fd0 d19032c1 00000000 00000000 00000000 d1903200 c7b63fe0 c013d80a
> > > > [  541.004022]  c013d7d0 00000000 00000000 c0103cf3 cef6ee70 00000000 00000000 00000000
> > > > [  541.004022]  00000201 000004b4
> > > > [  541.004022] Call Trace:
> > > > [  541.004022]  [<c013d80a>] ? kthread+0x3a/0x70
> > > > [  541.004022]  [<c013d7d0>] ? kthread+0x0/0x70
> > > > [  541.004022]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
> > > > [  541.004022] Code:  Bad EIP value.
> > > > [  541.004022] EIP: [<d1902efd>] 0xd1902efd SS:ESP 0068:c7b63fb8
> > > > [  541.004022] ---[ end trace cb3b10c2bb94b4e3 ]---
> > > > 
> > > > 
> > > > 00000e90 <rcu_stutter_wait>:
> > > >      e90:	55                   	push   %ebp
> > > >      e91:	89 e5                	mov    %esp,%ebp
> > > >      e93:	90                   	nop    
> > > >      e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
> > > >      e98:	a1 98 00 00 00       	mov    0x98,%eax
> > > >      e9d:	85 c0                	test   %eax,%eax
> > > >      e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
> > > >      ea1:	a1 00 00 00 00       	mov    0x0,%eax
> > > >      ea6:	85 c0                	test   %eax,%eax
> > > >      ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
> > > >      eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > > >      eaf:	85 c0                	test   %eax,%eax
> > > >      eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
> > > >      eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
> > > >      eb9:	85 d2                	test   %edx,%edx
> > > >      ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
> > > >      ebd:	b8 01 00 00 00       	mov    $0x1,%eax
> > > >      ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
> > > >      ec7:	a1 98 00 00 00       	mov    0x98,%eax
> > > >      ecc:	85 c0                	test   %eax,%eax
> > > >      ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
> > > >      ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > > >      ed5:	85 c0                	test   %eax,%eax
> > > >      ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
> > > >      ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > > >      ee0:	5d                   	pop    %ebp
> > > >      ee1:	c3                   	ret    
> > > >      ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > > >      ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
> > > >      eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
> > > 
> > > Here we are again calling one byte into the current instruction!!!
> > > 
> > > Or am I misinterpreting this code?
> > > 
> > > >      ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > > >      ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
> > > >      efd:	8d 76 00             	lea    0x0(%esi),%esi
> > > > 			   ^------------- here
> > > > 
> > > > This one looks more like it can explain a page fault
> > > 
> > > I don't understand why there are indirections in the assembly given the
> > > C code for rcu_stutter_wait().
> > > 
> > > >      f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
> > > >      f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > > >      f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi
> > 
> > ok, after trying to find out if the ubuntu gccs are broken, i stumbled
> > upon this:
> > http://forum.soft32.com/linux/Strange-problem-disassembling-shared-lib-ftopict439936.html
> > 
> > Seems the difference is that you dont compile it as a module and the
> > jump is perfectly normal, it gets overwritten when the stuff is loaded
> > objdump -dr gives me
> > 
> > 00000e90 <rcu_stutter_wait>:
> >      e90:	55                   	push   %ebp
> >      e91:	89 e5                	mov    %esp,%ebp
> >      e93:	90                   	nop    
> >      e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
> >      e98:	a1 98 00 00 00       	mov    0x98,%eax
> > 			e99: R_386_32	.bss
> >      e9d:	85 c0                	test   %eax,%eax
> >      e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
> >      ea1:	a1 00 00 00 00       	mov    0x0,%eax
> > 			ea2: R_386_32	rcutorture_runnable
> >      ea6:	85 c0                	test   %eax,%eax
> >      ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
> >      eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > 			eab: R_386_32	.bss
> >      eaf:	85 c0                	test   %eax,%eax
> >      eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
> >      eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
> > 			eb5: R_386_32	rcutorture_runnable
> >      eb9:	85 d2                	test   %edx,%edx
> >      ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
> >      ebd:	b8 01 00 00 00       	mov    $0x1,%eax
> >      ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
> > 			ec3: R_386_PC32	schedule_timeout_interruptible
> >      ec7:	a1 98 00 00 00       	mov    0x98,%eax
> > 			ec8: R_386_32	.bss
> >      ecc:	85 c0                	test   %eax,%eax
> >      ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
> >      ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > 			ed1: R_386_32	.bss
> >      ed5:	85 c0                	test   %eax,%eax
> >      ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
> >      ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> >      ee0:	5d                   	pop    %ebp
> >      ee1:	c3                   	ret    
> >      ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> >      ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
> >      eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
> > 			eee: R_386_PC32	round_jiffies_relative
> >      ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> >      ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
> > 			ef9: R_386_PC32	schedule_timeout_interruptible
> >      efd:	8d 76 00             	lea    0x0(%esi),%esi
> > 
> > here is the deref ------------------------^
> 
> Ah!!!  We are getting a page fault while cleaning up the stack frame?
> 
> Ouch!
> 
> >      f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
> >      f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> >      f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi

And now I can run this.  gcc v3.4.4 gives yet a different error:

divide error: 0000 [#1] PREEMPT SMP 

last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:02:04.0/host0/target0:0:6/0:0:6:0/type

CPU 1 

Modules linked in: [last unloaded: rcutorture]

Pid: 3369, comm: rcu_torture_rea Not tainted 2.6.28-autokern1 #1

RIP: 0010:[<ffffffffa0000142>]  [<ffffffffa0000142>] 0xffffffffa0000142

RSP: 0000:ffff88007f0afeb0  EFLAGS: 00010246

RAX: 00000000def36d6a RBX: ffffffffa0006850 RCX: 0000000000000000

RDX: 0000000000000000 RSI: ffff88007ecf1600 RDI: ffff88007f0afef0

RBP: 0000000000000dd4 R08: ffff88007f0ae000 R09: ffffffff8037c7d9

R10: 0000000000000000 R11: ffffffff8037c7d9 R12: 0000000000000000

R13: 000000000000000a R14: 0000000000000000 R15: 0000000000000000

FS:  0000000000000000(0000) GS:ffff8800e3801c00(0000) knlGS:00000000f7f3ab80

CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b

CR2: 00000000080dc87c CR3: 0000000000201000 CR4: 00000000000006e0

DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Process rcu_torture_rea (pid: 3369, threadinfo ffff88007f0ae000, task ffff88007ecf5d80)

Stack:

 ffffffff8037c7d9 ffffffffa000090d ffff88007ecfdec0 ffff88007ec17eb0

 0000000000000001 ffffffffa00006e6 0000000000000000 ffff8800e3488000

 c744ca136d6adef3 0000000000000256 0000000000000000 ffffffffa0000807

Call Trace:

 [<ffffffff8037c7d9>] ? delay_tsc+0x0/0x99

 [<ffffffff80245b94>] ? kthread+0x3d/0x63

 [<ffffffff8020c3ea>] ? child_rip+0xa/0x20

 [<ffffffff80245b57>] ? kthread+0x0/0x63

 [<ffffffff8020c3e0>] ? child_rip+0x0/0x20

Code: 58 c3 56 bf 01 00 00 00 e8 88 bd 22 e0 59 31 c0 c3 41 51 e8 69 ff ff ff 48 63 15 0a 58 00 00 48 69 d2 90 01 00 00 48 89 d1 31 d2 <48> f7 f1 48 85 d2 75 0c 41 58 bf 78 1b 0d 00 e9 32 c7 37 e0 5f 

RIP  [<ffffffffa0000142>] 0xffffffffa0000142

 RSP <ffff88007f0afeb0>


I will see what I can find from this.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-06  0:29                                       ` Paul E. McKenney
@ 2009-01-06  2:15                                         ` Paul E. McKenney
  2009-01-06  7:47                                           ` Eric Sesterhenn
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-06  2:15 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

[-- Attachment #1: Type: text/plain, Size: 14592 bytes --]

On Mon, Jan 05, 2009 at 04:29:12PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 05, 2009 at 02:18:36PM -0800, Paul E. McKenney wrote:
> > On Mon, Jan 05, 2009 at 09:31:53PM +0100, Eric Sesterhenn wrote:
> > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > On Mon, Jan 05, 2009 at 09:01:45PM +0100, Eric Sesterhenn wrote:
> > > > > hi,
> > > > > 
> > > > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > > > On Mon, Jan 05, 2009 at 07:56:55PM +0100, Eric Sesterhenn wrote:
> > > > > > 
> > > > > > Wow!!!  Am I reading this correctly?  Does the above "call" instruction
> > > > > > -really- call one byte into itself?  That is what the hex for the x86
> > > > > > instruction -looks- like it is doing, but I cannot see what would have
> > > > > > possessed the compiler to generate this code.
> > > > > 
> > > > > Compiler is gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)
> > > > 
> > > > I am using 4.1.3, for whatever it is worth.  (Ancient, I know!)
> > > > 
> > > > > > When I compile on a 32-bit x86 machine, I don't see the above "call"
> > > > > > instruction.  Other than that, the code I see looks consistent.
> > > > > > 
> > > > > > >      9f0:       eb 1d                   jmp    a0f <rcu_stutter_wait+0x27>
> > > > > > >      9f2:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > > > > > >      9f9:       b8 01 00 00 00          mov    $0x1,%eax
> > > > > > >      9fe:       75 0a                   jne    a0a <rcu_stutter_wait+0x22>
> > > > > > >      a00:       b8 e8 03 00 00          mov    $0x3e8,%eax
> > > > > > >      a05:       e8 fc ff ff ff          call   a06 <rcu_stutter_wait+0x1e>
> > > > > > >      a0a:       e8 fc ff ff ff          call   a0b <rcu_stutter_wait+0x23>
> > > > > > >      a0f:       83 3d 6c 00 00 00 00    cmpl   $0x0,0x6c
> > > > > > > 			^---------- this line
> > > > > > 
> > > > > > This looks like the first test in the "while" loop.
> > > > > > 
> > > > > > >      a16:       75 09                   jne    a21 <rcu_stutter_wait+0x39>
> > > > > > >      a18:       83 3d 00 00 00 00 00    cmpl   $0x0,0x0
> > > > > > >      a1f:       75 09                   jne    a2a <rcu_stutter_wait+0x42>
> > > > > > >      a21:       83 3d 50 1a 00 00 00    cmpl   $0x0,0x1a50
> > > > > > >      a28:       74 c8                   je     9f2 <rcu_stutter_wait+0xa>
> > > > > > >      a2a:       5d                      pop    %ebp
> > > > > > >      a2b:       c3                      ret
> > > > > > 
> > > > > > The corresponding C code is as follows:
> > > > > > 
> > > > > > static void
> > > > > > rcu_stutter_wait(void)
> > > > > > {
> > > > > > 	while ((stutter_pause_test || !rcutorture_runnable) && !fullstop) {
> > > > > > 		if (rcutorture_runnable)
> > > > > > 			schedule_timeout_interruptible(1);
> > > > > > 		else
> > > > > > 			schedule_timeout_interruptible(round_jiffies_relative(HZ));
> > > > > > 	}
> > > > > > }
> > > > > > 
> > > > > > I don't see much opportunity for a page fault here...  This is the
> > > > > > binary I get when I compile it, though not as a module:
> > > > > > 
> > > > > > 0000085a <rcu_stutter_wait>:
> > > > > >      85a:	55                   	push   %ebp
> > > > > >      85b:	89 e5                	mov    %esp,%ebp
> > > > > >      85d:	eb 1d                	jmp    87c <rcu_stutter_wait+0x22>
> > > > > >      85f:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> > > > > >      866:	b8 01 00 00 00       	mov    $0x1,%eax
> > > > > >      86b:	75 0a                	jne    877 <rcu_stutter_wait+0x1d>
> > > > > >      86d:	b8 e8 03 00 00       	mov    $0x3e8,%eax
> > > > > >      872:	e8 fc ff ff ff       	call   873 <rcu_stutter_wait+0x19>
> > > > > >      877:	e8 fc ff ff ff       	call   878 <rcu_stutter_wait+0x1e>
> > > > > >      87c:	83 3d 14 00 00 00 00 	cmpl   $0x0,0x14
> > > > > >      883:	75 09                	jne    88e <rcu_stutter_wait+0x34>
> > > > > >      885:	83 3d 00 00 00 00 00 	cmpl   $0x0,0x0
> > > > > >      88c:	75 09                	jne    897 <rcu_stutter_wait+0x3d>
> > > > > >      88e:	83 3d 08 1a 00 00 00 	cmpl   $0x0,0x1a08
> > > > > >      895:	74 c8                	je     85f <rcu_stutter_wait+0x5>
> > > > > >      897:	5d                   	pop    %ebp
> > > > > >      898:	c3                   	ret    
> > > > > > 
> > > > > > I confess, I am confused!!!
> > > > > 
> > > > > on the other box with a different gcc version
> > > > > 
> > > > > gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu11) 
> > > > > 
> > > > > d1902e90 is the start of rcu_stutter_wait
> > > > > 
> > > > > [  533.391719] d087e000 d1902e90
> > > > > [  533.392294] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1
> > > > > [  541.000139] BUG: unable to handle kernel paging request at d1902efd
> > > > > [  541.000423] IP: [<d1902efd>] 0xd1902efd
> > > > > [  541.000660] *pde = 0f08f067 *pte = 00000000 
> > > > > [  541.000867] Oops: 0000 [#1] DEBUG_PAGEALLOC
> > > > > [  541.001126] last sysfs file: /sys/block/sda/size
> > > > > [  541.001246] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc ipv6 fuse unix [last unloaded: rcutorture]
> > > > > [  541.002235] 
> > > > > [  541.002334] Pid: 5292, comm: rcu_torture_wri Not tainted (2.6.28 #84) 
> > > > > [  541.002470] EIP: 0060:[<d1902efd>] EFLAGS: 00010296 CPU: 0
> > > > > [  541.002598] EIP is at 0xd1902efd
> > > > > [  541.002767] EAX: 00000000 EBX: d19073c0 ECX: 00000000 EDX: 00000000
> > > > > [  541.002900] ESI: 0000000a EDI: 00000000 EBP: c7b63fb8 ESP: c7b63fb8
> > > > > [  541.003033]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> > > > > [  541.003160] Process rcu_torture_wri (pid: 5292, ti=c7b63000 task=c7b09710 task.ti=c7b63000)
> > > > > [  541.003400] Stack:
> > > > > [  541.003497]  c7b63fd0 d19032c1 00000000 00000000 00000000 d1903200 c7b63fe0 c013d80a
> > > > > [  541.004022]  c013d7d0 00000000 00000000 c0103cf3 cef6ee70 00000000 00000000 00000000
> > > > > [  541.004022]  00000201 000004b4
> > > > > [  541.004022] Call Trace:
> > > > > [  541.004022]  [<c013d80a>] ? kthread+0x3a/0x70
> > > > > [  541.004022]  [<c013d7d0>] ? kthread+0x0/0x70
> > > > > [  541.004022]  [<c0103cf3>] ? kernel_thread_helper+0x7/0x14
> > > > > [  541.004022] Code:  Bad EIP value.
> > > > > [  541.004022] EIP: [<d1902efd>] 0xd1902efd SS:ESP 0068:c7b63fb8
> > > > > [  541.004022] ---[ end trace cb3b10c2bb94b4e3 ]---
> > > > > 
> > > > > 
> > > > > 00000e90 <rcu_stutter_wait>:
> > > > >      e90:	55                   	push   %ebp
> > > > >      e91:	89 e5                	mov    %esp,%ebp
> > > > >      e93:	90                   	nop    
> > > > >      e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
> > > > >      e98:	a1 98 00 00 00       	mov    0x98,%eax
> > > > >      e9d:	85 c0                	test   %eax,%eax
> > > > >      e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
> > > > >      ea1:	a1 00 00 00 00       	mov    0x0,%eax
> > > > >      ea6:	85 c0                	test   %eax,%eax
> > > > >      ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
> > > > >      eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > > > >      eaf:	85 c0                	test   %eax,%eax
> > > > >      eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
> > > > >      eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
> > > > >      eb9:	85 d2                	test   %edx,%edx
> > > > >      ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
> > > > >      ebd:	b8 01 00 00 00       	mov    $0x1,%eax
> > > > >      ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
> > > > >      ec7:	a1 98 00 00 00       	mov    0x98,%eax
> > > > >      ecc:	85 c0                	test   %eax,%eax
> > > > >      ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
> > > > >      ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > > > >      ed5:	85 c0                	test   %eax,%eax
> > > > >      ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
> > > > >      ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > > > >      ee0:	5d                   	pop    %ebp
> > > > >      ee1:	c3                   	ret    
> > > > >      ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > > > >      ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
> > > > >      eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
> > > > 
> > > > Here we are again calling one byte into the current instruction!!!
> > > > 
> > > > Or am I misinterpreting this code?
> > > > 
> > > > >      ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > > > >      ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
> > > > >      efd:	8d 76 00             	lea    0x0(%esi),%esi
> > > > > 			   ^------------- here
> > > > > 
> > > > > This one looks more like it can explain a page fault
> > > > 
> > > > I don't understand why there are indirections in the assembly given the
> > > > C code for rcu_stutter_wait().
> > > > 
> > > > >      f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
> > > > >      f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > > > >      f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi
> > > 
> > > ok, after trying to find out if the ubuntu gccs are broken, i stumbled
> > > upon this:
> > > http://forum.soft32.com/linux/Strange-problem-disassembling-shared-lib-ftopict439936.html
> > > 
> > > Seems the difference is that you dont compile it as a module and the
> > > jump is perfectly normal, it gets overwritten when the stuff is loaded
> > > objdump -dr gives me
> > > 
> > > 00000e90 <rcu_stutter_wait>:
> > >      e90:	55                   	push   %ebp
> > >      e91:	89 e5                	mov    %esp,%ebp
> > >      e93:	90                   	nop    
> > >      e94:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
> > >      e98:	a1 98 00 00 00       	mov    0x98,%eax
> > > 			e99: R_386_32	.bss
> > >      e9d:	85 c0                	test   %eax,%eax
> > >      e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
> > >      ea1:	a1 00 00 00 00       	mov    0x0,%eax
> > > 			ea2: R_386_32	rcutorture_runnable
> > >      ea6:	85 c0                	test   %eax,%eax
> > >      ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
> > >      eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > > 			eab: R_386_32	.bss
> > >      eaf:	85 c0                	test   %eax,%eax
> > >      eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
> > >      eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
> > > 			eb5: R_386_32	rcutorture_runnable
> > >      eb9:	85 d2                	test   %edx,%edx
> > >      ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
> > >      ebd:	b8 01 00 00 00       	mov    $0x1,%eax
> > >      ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
> > > 			ec3: R_386_PC32	schedule_timeout_interruptible
> > >      ec7:	a1 98 00 00 00       	mov    0x98,%eax
> > > 			ec8: R_386_32	.bss
> > >      ecc:	85 c0                	test   %eax,%eax
> > >      ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
> > >      ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > > 			ed1: R_386_32	.bss
> > >      ed5:	85 c0                	test   %eax,%eax
> > >      ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
> > >      ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > >      ee0:	5d                   	pop    %ebp
> > >      ee1:	c3                   	ret    
> > >      ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > >      ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
> > >      eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
> > > 			eee: R_386_PC32	round_jiffies_relative
> > >      ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > >      ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
> > > 			ef9: R_386_PC32	schedule_timeout_interruptible
> > >      efd:	8d 76 00             	lea    0x0(%esi),%esi
> > > 
> > > here is the deref ------------------------^
> > 
> > Ah!!!  We are getting a page fault while cleaning up the stack frame?
> > 
> > Ouch!
> > 
> > >      f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
> > >      f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > >      f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi
> 
> And now I can run this.  gcc v3.4.4 gives yet a different error:
> 
> divide error: 0000 [#1] PREEMPT SMP 
> last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:02:04.0/host0/target0:0:6/0:0:6:0/type
> CPU 1 
> Modules linked in: [last unloaded: rcutorture]
> Pid: 3369, comm: rcu_torture_rea Not tainted 2.6.28-autokern1 #1
> RIP: 0010:[<ffffffffa0000142>]  [<ffffffffa0000142>] 0xffffffffa0000142
> RSP: 0000:ffff88007f0afeb0  EFLAGS: 00010246
> RAX: 00000000def36d6a RBX: ffffffffa0006850 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff88007ecf1600 RDI: ffff88007f0afef0
> RBP: 0000000000000dd4 R08: ffff88007f0ae000 R09: ffffffff8037c7d9
> R10: 0000000000000000 R11: ffffffff8037c7d9 R12: 0000000000000000
> R13: 000000000000000a R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff8800e3801c00(0000) knlGS:00000000f7f3ab80
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00000000080dc87c CR3: 0000000000201000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process rcu_torture_rea (pid: 3369, threadinfo ffff88007f0ae000, task ffff88007ecf5d80)
> Stack:
>  ffffffff8037c7d9 ffffffffa000090d ffff88007ecfdec0 ffff88007ec17eb0
>  0000000000000001 ffffffffa00006e6 0000000000000000 ffff8800e3488000
>  c744ca136d6adef3 0000000000000256 0000000000000000 ffffffffa0000807
> Call Trace:
>  [<ffffffff8037c7d9>] ? delay_tsc+0x0/0x99
>  [<ffffffff80245b94>] ? kthread+0x3d/0x63
>  [<ffffffff8020c3ea>] ? child_rip+0xa/0x20
>  [<ffffffff80245b57>] ? kthread+0x0/0x63
>  [<ffffffff8020c3e0>] ? child_rip+0x0/0x20
> Code: 58 c3 56 bf 01 00 00 00 e8 88 bd 22 e0 59 31 c0 c3 41 51 e8 69 ff ff ff 48 63 15 0a 58 00 00 48 69 d2 90 01 00 00 48 89 d1 31 d2 <48> f7 f1 48 85 d2 75 0c 41 58 bf 78 1b 0d 00 e9 32 c7 37 e0 5f 
> RIP  [<ffffffffa0000142>] 0xffffffffa0000142
>  RSP <ffff88007f0afeb0>
> 
> I will see what I can find from this.

Hello, Eric,

Just for grins, I built the 2.6.28 kernel/rcutorture.c on b58602a.  It
insmoded and rmmoded without errors.

Could you please try this on your setup?  I have attached this file in
case that helps.

							Thanx, Paul

[-- Attachment #2: kernel/rcutorture.c --]
[-- Type: text/x-csrc, Size: 31855 bytes --]

/*
 * Read-Copy Update module-based torture test facility
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * Copyright (C) IBM Corporation, 2005, 2006
 *
 * Authors: Paul E. McKenney <paulmck@us.ibm.com>
 *          Josh Triplett <josh@freedesktop.org>
 *
 * See also:  Documentation/RCU/torture.txt
 */
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/err.h>
#include <linux/spinlock.h>
#include <linux/smp.h>
#include <linux/rcupdate.h>
#include <linux/interrupt.h>
#include <linux/sched.h>
#include <asm/atomic.h>
#include <linux/bitops.h>
#include <linux/completion.h>
#include <linux/moduleparam.h>
#include <linux/percpu.h>
#include <linux/notifier.h>
#include <linux/freezer.h>
#include <linux/cpu.h>
#include <linux/delay.h>
#include <linux/stat.h>
#include <linux/srcu.h>
#include <linux/slab.h>
#include <asm/byteorder.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com> and "
              "Josh Triplett <josh@freedesktop.org>");

static int nreaders = -1;	/* # reader threads, defaults to 2*ncpus */
static int nfakewriters = 4;	/* # fake writer threads */
static int stat_interval;	/* Interval between stats, in seconds. */
				/*  Defaults to "only at end of test". */
static int verbose;		/* Print more debug info. */
static int test_no_idle_hz;	/* Test RCU's support for tickless idle CPUs. */
static int shuffle_interval = 3; /* Interval between shuffles (in sec)*/
static int stutter = 5;		/* Start/stop testing interval (in sec) */
static int irqreader = 1;	/* RCU readers from irq (timers). */
static char *torture_type = "rcu"; /* What RCU implementation to torture. */

module_param(nreaders, int, 0444);
MODULE_PARM_DESC(nreaders, "Number of RCU reader threads");
module_param(nfakewriters, int, 0444);
MODULE_PARM_DESC(nfakewriters, "Number of RCU fake writer threads");
module_param(stat_interval, int, 0444);
MODULE_PARM_DESC(stat_interval, "Number of seconds between stats printk()s");
module_param(verbose, bool, 0444);
MODULE_PARM_DESC(verbose, "Enable verbose debugging printk()s");
module_param(test_no_idle_hz, bool, 0444);
MODULE_PARM_DESC(test_no_idle_hz, "Test support for tickless idle CPUs");
module_param(shuffle_interval, int, 0444);
MODULE_PARM_DESC(shuffle_interval, "Number of seconds between shuffles");
module_param(stutter, int, 0444);
MODULE_PARM_DESC(stutter, "Number of seconds to run/halt test");
module_param(irqreader, int, 0444);
MODULE_PARM_DESC(irqreader, "Allow RCU readers from irq handlers");
module_param(torture_type, charp, 0444);
MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh, srcu)");

#define TORTURE_FLAG "-torture:"
#define PRINTK_STRING(s) \
	do { printk(KERN_ALERT "%s" TORTURE_FLAG s "\n", torture_type); } while (0)
#define VERBOSE_PRINTK_STRING(s) \
	do { if (verbose) printk(KERN_ALERT "%s" TORTURE_FLAG s "\n", torture_type); } while (0)
#define VERBOSE_PRINTK_ERRSTRING(s) \
	do { if (verbose) printk(KERN_ALERT "%s" TORTURE_FLAG "!!! " s "\n", torture_type); } while (0)

static char printk_buf[4096];

static int nrealreaders;
static struct task_struct *writer_task;
static struct task_struct **fakewriter_tasks;
static struct task_struct **reader_tasks;
static struct task_struct *stats_task;
static struct task_struct *shuffler_task;
static struct task_struct *stutter_task;

#define RCU_TORTURE_PIPE_LEN 10

struct rcu_torture {
	struct rcu_head rtort_rcu;
	int rtort_pipe_count;
	struct list_head rtort_free;
	int rtort_mbtest;
};

static int fullstop = 0;	/* stop generating callbacks at test end. */
static LIST_HEAD(rcu_torture_freelist);
static struct rcu_torture *rcu_torture_current = NULL;
static long rcu_torture_current_version = 0;
static struct rcu_torture rcu_tortures[10 * RCU_TORTURE_PIPE_LEN];
static DEFINE_SPINLOCK(rcu_torture_lock);
static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1], rcu_torture_count) =
	{ 0 };
static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1], rcu_torture_batch) =
	{ 0 };
static atomic_t rcu_torture_wcount[RCU_TORTURE_PIPE_LEN + 1];
static atomic_t n_rcu_torture_alloc;
static atomic_t n_rcu_torture_alloc_fail;
static atomic_t n_rcu_torture_free;
static atomic_t n_rcu_torture_mberror;
static atomic_t n_rcu_torture_error;
static long n_rcu_torture_timers = 0;
static struct list_head rcu_torture_removed;

static int stutter_pause_test = 0;

#if defined(MODULE) || defined(CONFIG_RCU_TORTURE_TEST_RUNNABLE)
#define RCUTORTURE_RUNNABLE_INIT 1
#else
#define RCUTORTURE_RUNNABLE_INIT 0
#endif
int rcutorture_runnable = RCUTORTURE_RUNNABLE_INIT;

/*
 * Allocate an element from the rcu_tortures pool.
 */
static struct rcu_torture *
rcu_torture_alloc(void)
{
	struct list_head *p;

	spin_lock_bh(&rcu_torture_lock);
	if (list_empty(&rcu_torture_freelist)) {
		atomic_inc(&n_rcu_torture_alloc_fail);
		spin_unlock_bh(&rcu_torture_lock);
		return NULL;
	}
	atomic_inc(&n_rcu_torture_alloc);
	p = rcu_torture_freelist.next;
	list_del_init(p);
	spin_unlock_bh(&rcu_torture_lock);
	return container_of(p, struct rcu_torture, rtort_free);
}

/*
 * Free an element to the rcu_tortures pool.
 */
static void
rcu_torture_free(struct rcu_torture *p)
{
	atomic_inc(&n_rcu_torture_free);
	spin_lock_bh(&rcu_torture_lock);
	list_add_tail(&p->rtort_free, &rcu_torture_freelist);
	spin_unlock_bh(&rcu_torture_lock);
}

struct rcu_random_state {
	unsigned long rrs_state;
	long rrs_count;
};

#define RCU_RANDOM_MULT 39916801  /* prime */
#define RCU_RANDOM_ADD	479001701 /* prime */
#define RCU_RANDOM_REFRESH 10000

#define DEFINE_RCU_RANDOM(name) struct rcu_random_state name = { 0, 0 }

/*
 * Crude but fast random-number generator.  Uses a linear congruential
 * generator, with occasional help from cpu_clock().
 */
static unsigned long
rcu_random(struct rcu_random_state *rrsp)
{
	if (--rrsp->rrs_count < 0) {
		rrsp->rrs_state +=
			(unsigned long)cpu_clock(raw_smp_processor_id());
		rrsp->rrs_count = RCU_RANDOM_REFRESH;
	}
	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
	return swahw32(rrsp->rrs_state);
}

static void
rcu_stutter_wait(void)
{
	while (stutter_pause_test || !rcutorture_runnable)
		if (rcutorture_runnable)
			schedule_timeout_interruptible(1);
		else
			schedule_timeout_interruptible(round_jiffies_relative(HZ));
}

/*
 * Operations vector for selecting different types of tests.
 */

struct rcu_torture_ops {
	void (*init)(void);
	void (*cleanup)(void);
	int (*readlock)(void);
	void (*readdelay)(struct rcu_random_state *rrsp);
	void (*readunlock)(int idx);
	int (*completed)(void);
	void (*deferredfree)(struct rcu_torture *p);
	void (*sync)(void);
	void (*cb_barrier)(void);
	int (*stats)(char *page);
	int irqcapable;
	char *name;
};
static struct rcu_torture_ops *cur_ops = NULL;

/*
 * Definitions for rcu torture testing.
 */

static int rcu_torture_read_lock(void) __acquires(RCU)
{
	rcu_read_lock();
	return 0;
}

static void rcu_read_delay(struct rcu_random_state *rrsp)
{
	long delay;
	const long longdelay = 200;

	/* We want there to be long-running readers, but not all the time. */

	delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay);
	if (!delay)
		udelay(longdelay);
}

static void rcu_torture_read_unlock(int idx) __releases(RCU)
{
	rcu_read_unlock();
}

static int rcu_torture_completed(void)
{
	return rcu_batches_completed();
}

static void
rcu_torture_cb(struct rcu_head *p)
{
	int i;
	struct rcu_torture *rp = container_of(p, struct rcu_torture, rtort_rcu);

	if (fullstop) {
		/* Test is ending, just drop callbacks on the floor. */
		/* The next initialization will pick up the pieces. */
		return;
	}
	i = rp->rtort_pipe_count;
	if (i > RCU_TORTURE_PIPE_LEN)
		i = RCU_TORTURE_PIPE_LEN;
	atomic_inc(&rcu_torture_wcount[i]);
	if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
		rp->rtort_mbtest = 0;
		rcu_torture_free(rp);
	} else
		cur_ops->deferredfree(rp);
}

static void rcu_torture_deferred_free(struct rcu_torture *p)
{
	call_rcu(&p->rtort_rcu, rcu_torture_cb);
}

static struct rcu_torture_ops rcu_ops = {
	.init = NULL,
	.cleanup = NULL,
	.readlock = rcu_torture_read_lock,
	.readdelay = rcu_read_delay,
	.readunlock = rcu_torture_read_unlock,
	.completed = rcu_torture_completed,
	.deferredfree = rcu_torture_deferred_free,
	.sync = synchronize_rcu,
	.cb_barrier = rcu_barrier,
	.stats = NULL,
	.irqcapable = 1,
	.name = "rcu"
};

static void rcu_sync_torture_deferred_free(struct rcu_torture *p)
{
	int i;
	struct rcu_torture *rp;
	struct rcu_torture *rp1;

	cur_ops->sync();
	list_add(&p->rtort_free, &rcu_torture_removed);
	list_for_each_entry_safe(rp, rp1, &rcu_torture_removed, rtort_free) {
		i = rp->rtort_pipe_count;
		if (i > RCU_TORTURE_PIPE_LEN)
			i = RCU_TORTURE_PIPE_LEN;
		atomic_inc(&rcu_torture_wcount[i]);
		if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
			rp->rtort_mbtest = 0;
			list_del(&rp->rtort_free);
			rcu_torture_free(rp);
		}
	}
}

static void rcu_sync_torture_init(void)
{
	INIT_LIST_HEAD(&rcu_torture_removed);
}

static struct rcu_torture_ops rcu_sync_ops = {
	.init = rcu_sync_torture_init,
	.cleanup = NULL,
	.readlock = rcu_torture_read_lock,
	.readdelay = rcu_read_delay,
	.readunlock = rcu_torture_read_unlock,
	.completed = rcu_torture_completed,
	.deferredfree = rcu_sync_torture_deferred_free,
	.sync = synchronize_rcu,
	.cb_barrier = NULL,
	.stats = NULL,
	.irqcapable = 1,
	.name = "rcu_sync"
};

/*
 * Definitions for rcu_bh torture testing.
 */

static int rcu_bh_torture_read_lock(void) __acquires(RCU_BH)
{
	rcu_read_lock_bh();
	return 0;
}

static void rcu_bh_torture_read_unlock(int idx) __releases(RCU_BH)
{
	rcu_read_unlock_bh();
}

static int rcu_bh_torture_completed(void)
{
	return rcu_batches_completed_bh();
}

static void rcu_bh_torture_deferred_free(struct rcu_torture *p)
{
	call_rcu_bh(&p->rtort_rcu, rcu_torture_cb);
}

struct rcu_bh_torture_synchronize {
	struct rcu_head head;
	struct completion completion;
};

static void rcu_bh_torture_wakeme_after_cb(struct rcu_head *head)
{
	struct rcu_bh_torture_synchronize *rcu;

	rcu = container_of(head, struct rcu_bh_torture_synchronize, head);
	complete(&rcu->completion);
}

static void rcu_bh_torture_synchronize(void)
{
	struct rcu_bh_torture_synchronize rcu;

	init_completion(&rcu.completion);
	call_rcu_bh(&rcu.head, rcu_bh_torture_wakeme_after_cb);
	wait_for_completion(&rcu.completion);
}

static struct rcu_torture_ops rcu_bh_ops = {
	.init = NULL,
	.cleanup = NULL,
	.readlock = rcu_bh_torture_read_lock,
	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
	.readunlock = rcu_bh_torture_read_unlock,
	.completed = rcu_bh_torture_completed,
	.deferredfree = rcu_bh_torture_deferred_free,
	.sync = rcu_bh_torture_synchronize,
	.cb_barrier = rcu_barrier_bh,
	.stats = NULL,
	.irqcapable = 1,
	.name = "rcu_bh"
};

static struct rcu_torture_ops rcu_bh_sync_ops = {
	.init = rcu_sync_torture_init,
	.cleanup = NULL,
	.readlock = rcu_bh_torture_read_lock,
	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
	.readunlock = rcu_bh_torture_read_unlock,
	.completed = rcu_bh_torture_completed,
	.deferredfree = rcu_sync_torture_deferred_free,
	.sync = rcu_bh_torture_synchronize,
	.cb_barrier = NULL,
	.stats = NULL,
	.irqcapable = 1,
	.name = "rcu_bh_sync"
};

/*
 * Definitions for srcu torture testing.
 */

static struct srcu_struct srcu_ctl;

static void srcu_torture_init(void)
{
	init_srcu_struct(&srcu_ctl);
	rcu_sync_torture_init();
}

static void srcu_torture_cleanup(void)
{
	synchronize_srcu(&srcu_ctl);
	cleanup_srcu_struct(&srcu_ctl);
}

static int srcu_torture_read_lock(void) __acquires(&srcu_ctl)
{
	return srcu_read_lock(&srcu_ctl);
}

static void srcu_read_delay(struct rcu_random_state *rrsp)
{
	long delay;
	const long uspertick = 1000000 / HZ;
	const long longdelay = 10;

	/* We want there to be long-running readers, but not all the time. */

	delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay * uspertick);
	if (!delay)
		schedule_timeout_interruptible(longdelay);
}

static void srcu_torture_read_unlock(int idx) __releases(&srcu_ctl)
{
	srcu_read_unlock(&srcu_ctl, idx);
}

static int srcu_torture_completed(void)
{
	return srcu_batches_completed(&srcu_ctl);
}

static void srcu_torture_synchronize(void)
{
	synchronize_srcu(&srcu_ctl);
}

static int srcu_torture_stats(char *page)
{
	int cnt = 0;
	int cpu;
	int idx = srcu_ctl.completed & 0x1;

	cnt += sprintf(&page[cnt], "%s%s per-CPU(idx=%d):",
		       torture_type, TORTURE_FLAG, idx);
	for_each_possible_cpu(cpu) {
		cnt += sprintf(&page[cnt], " %d(%d,%d)", cpu,
			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx],
			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]);
	}
	cnt += sprintf(&page[cnt], "\n");
	return cnt;
}

static struct rcu_torture_ops srcu_ops = {
	.init = srcu_torture_init,
	.cleanup = srcu_torture_cleanup,
	.readlock = srcu_torture_read_lock,
	.readdelay = srcu_read_delay,
	.readunlock = srcu_torture_read_unlock,
	.completed = srcu_torture_completed,
	.deferredfree = rcu_sync_torture_deferred_free,
	.sync = srcu_torture_synchronize,
	.cb_barrier = NULL,
	.stats = srcu_torture_stats,
	.name = "srcu"
};

/*
 * Definitions for sched torture testing.
 */

static int sched_torture_read_lock(void)
{
	preempt_disable();
	return 0;
}

static void sched_torture_read_unlock(int idx)
{
	preempt_enable();
}

static int sched_torture_completed(void)
{
	return 0;
}

static void rcu_sched_torture_deferred_free(struct rcu_torture *p)
{
	call_rcu_sched(&p->rtort_rcu, rcu_torture_cb);
}

static void sched_torture_synchronize(void)
{
	synchronize_sched();
}

static struct rcu_torture_ops sched_ops = {
	.init = rcu_sync_torture_init,
	.cleanup = NULL,
	.readlock = sched_torture_read_lock,
	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
	.readunlock = sched_torture_read_unlock,
	.completed = sched_torture_completed,
	.deferredfree = rcu_sched_torture_deferred_free,
	.sync = sched_torture_synchronize,
	.cb_barrier = rcu_barrier_sched,
	.stats = NULL,
	.irqcapable = 1,
	.name = "sched"
};

static struct rcu_torture_ops sched_ops_sync = {
	.init = rcu_sync_torture_init,
	.cleanup = NULL,
	.readlock = sched_torture_read_lock,
	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
	.readunlock = sched_torture_read_unlock,
	.completed = sched_torture_completed,
	.deferredfree = rcu_sync_torture_deferred_free,
	.sync = sched_torture_synchronize,
	.cb_barrier = NULL,
	.stats = NULL,
	.name = "sched_sync"
};

/*
 * RCU torture writer kthread.  Repeatedly substitutes a new structure
 * for that pointed to by rcu_torture_current, freeing the old structure
 * after a series of grace periods (the "pipeline").
 */
static int
rcu_torture_writer(void *arg)
{
	int i;
	long oldbatch = rcu_batches_completed();
	struct rcu_torture *rp;
	struct rcu_torture *old_rp;
	static DEFINE_RCU_RANDOM(rand);

	VERBOSE_PRINTK_STRING("rcu_torture_writer task started");
	set_user_nice(current, 19);

	do {
		schedule_timeout_uninterruptible(1);
		if ((rp = rcu_torture_alloc()) == NULL)
			continue;
		rp->rtort_pipe_count = 0;
		udelay(rcu_random(&rand) & 0x3ff);
		old_rp = rcu_torture_current;
		rp->rtort_mbtest = 1;
		rcu_assign_pointer(rcu_torture_current, rp);
		smp_wmb();
		if (old_rp) {
			i = old_rp->rtort_pipe_count;
			if (i > RCU_TORTURE_PIPE_LEN)
				i = RCU_TORTURE_PIPE_LEN;
			atomic_inc(&rcu_torture_wcount[i]);
			old_rp->rtort_pipe_count++;
			cur_ops->deferredfree(old_rp);
		}
		rcu_torture_current_version++;
		oldbatch = cur_ops->completed();
		rcu_stutter_wait();
	} while (!kthread_should_stop() && !fullstop);
	VERBOSE_PRINTK_STRING("rcu_torture_writer task stopping");
	while (!kthread_should_stop())
		schedule_timeout_uninterruptible(1);
	return 0;
}

/*
 * RCU torture fake writer kthread.  Repeatedly calls sync, with a random
 * delay between calls.
 */
static int
rcu_torture_fakewriter(void *arg)
{
	DEFINE_RCU_RANDOM(rand);

	VERBOSE_PRINTK_STRING("rcu_torture_fakewriter task started");
	set_user_nice(current, 19);

	do {
		schedule_timeout_uninterruptible(1 + rcu_random(&rand)%10);
		udelay(rcu_random(&rand) & 0x3ff);
		cur_ops->sync();
		rcu_stutter_wait();
	} while (!kthread_should_stop() && !fullstop);

	VERBOSE_PRINTK_STRING("rcu_torture_fakewriter task stopping");
	while (!kthread_should_stop())
		schedule_timeout_uninterruptible(1);
	return 0;
}

/*
 * RCU torture reader from timer handler.  Dereferences rcu_torture_current,
 * incrementing the corresponding element of the pipeline array.  The
 * counter in the element should never be greater than 1, otherwise, the
 * RCU implementation is broken.
 */
static void rcu_torture_timer(unsigned long unused)
{
	int idx;
	int completed;
	static DEFINE_RCU_RANDOM(rand);
	static DEFINE_SPINLOCK(rand_lock);
	struct rcu_torture *p;
	int pipe_count;

	idx = cur_ops->readlock();
	completed = cur_ops->completed();
	p = rcu_dereference(rcu_torture_current);
	if (p == NULL) {
		/* Leave because rcu_torture_writer is not yet underway */
		cur_ops->readunlock(idx);
		return;
	}
	if (p->rtort_mbtest == 0)
		atomic_inc(&n_rcu_torture_mberror);
	spin_lock(&rand_lock);
	cur_ops->readdelay(&rand);
	n_rcu_torture_timers++;
	spin_unlock(&rand_lock);
	preempt_disable();
	pipe_count = p->rtort_pipe_count;
	if (pipe_count > RCU_TORTURE_PIPE_LEN) {
		/* Should not happen, but... */
		pipe_count = RCU_TORTURE_PIPE_LEN;
	}
	++__get_cpu_var(rcu_torture_count)[pipe_count];
	completed = cur_ops->completed() - completed;
	if (completed > RCU_TORTURE_PIPE_LEN) {
		/* Should not happen, but... */
		completed = RCU_TORTURE_PIPE_LEN;
	}
	++__get_cpu_var(rcu_torture_batch)[completed];
	preempt_enable();
	cur_ops->readunlock(idx);
}

/*
 * RCU torture reader kthread.  Repeatedly dereferences rcu_torture_current,
 * incrementing the corresponding element of the pipeline array.  The
 * counter in the element should never be greater than 1, otherwise, the
 * RCU implementation is broken.
 */
static int
rcu_torture_reader(void *arg)
{
	int completed;
	int idx;
	DEFINE_RCU_RANDOM(rand);
	struct rcu_torture *p;
	int pipe_count;
	struct timer_list t;

	VERBOSE_PRINTK_STRING("rcu_torture_reader task started");
	set_user_nice(current, 19);
	if (irqreader && cur_ops->irqcapable)
		setup_timer_on_stack(&t, rcu_torture_timer, 0);

	do {
		if (irqreader && cur_ops->irqcapable) {
			if (!timer_pending(&t))
				mod_timer(&t, 1);
		}
		idx = cur_ops->readlock();
		completed = cur_ops->completed();
		p = rcu_dereference(rcu_torture_current);
		if (p == NULL) {
			/* Wait for rcu_torture_writer to get underway */
			cur_ops->readunlock(idx);
			schedule_timeout_interruptible(HZ);
			continue;
		}
		if (p->rtort_mbtest == 0)
			atomic_inc(&n_rcu_torture_mberror);
		cur_ops->readdelay(&rand);
		preempt_disable();
		pipe_count = p->rtort_pipe_count;
		if (pipe_count > RCU_TORTURE_PIPE_LEN) {
			/* Should not happen, but... */
			pipe_count = RCU_TORTURE_PIPE_LEN;
		}
		++__get_cpu_var(rcu_torture_count)[pipe_count];
		completed = cur_ops->completed() - completed;
		if (completed > RCU_TORTURE_PIPE_LEN) {
			/* Should not happen, but... */
			completed = RCU_TORTURE_PIPE_LEN;
		}
		++__get_cpu_var(rcu_torture_batch)[completed];
		preempt_enable();
		cur_ops->readunlock(idx);
		schedule();
		rcu_stutter_wait();
	} while (!kthread_should_stop() && !fullstop);
	VERBOSE_PRINTK_STRING("rcu_torture_reader task stopping");
	if (irqreader && cur_ops->irqcapable)
		del_timer_sync(&t);
	while (!kthread_should_stop())
		schedule_timeout_uninterruptible(1);
	return 0;
}

/*
 * Create an RCU-torture statistics message in the specified buffer.
 */
static int
rcu_torture_printk(char *page)
{
	int cnt = 0;
	int cpu;
	int i;
	long pipesummary[RCU_TORTURE_PIPE_LEN + 1] = { 0 };
	long batchsummary[RCU_TORTURE_PIPE_LEN + 1] = { 0 };

	for_each_possible_cpu(cpu) {
		for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++) {
			pipesummary[i] += per_cpu(rcu_torture_count, cpu)[i];
			batchsummary[i] += per_cpu(rcu_torture_batch, cpu)[i];
		}
	}
	for (i = RCU_TORTURE_PIPE_LEN - 1; i >= 0; i--) {
		if (pipesummary[i] != 0)
			break;
	}
	cnt += sprintf(&page[cnt], "%s%s ", torture_type, TORTURE_FLAG);
	cnt += sprintf(&page[cnt],
		       "rtc: %p ver: %ld tfle: %d rta: %d rtaf: %d rtf: %d "
		       "rtmbe: %d nt: %ld",
		       rcu_torture_current,
		       rcu_torture_current_version,
		       list_empty(&rcu_torture_freelist),
		       atomic_read(&n_rcu_torture_alloc),
		       atomic_read(&n_rcu_torture_alloc_fail),
		       atomic_read(&n_rcu_torture_free),
		       atomic_read(&n_rcu_torture_mberror),
		       n_rcu_torture_timers);
	if (atomic_read(&n_rcu_torture_mberror) != 0)
		cnt += sprintf(&page[cnt], " !!!");
	cnt += sprintf(&page[cnt], "\n%s%s ", torture_type, TORTURE_FLAG);
	if (i > 1) {
		cnt += sprintf(&page[cnt], "!!! ");
		atomic_inc(&n_rcu_torture_error);
		WARN_ON_ONCE(1);
	}
	cnt += sprintf(&page[cnt], "Reader Pipe: ");
	for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++)
		cnt += sprintf(&page[cnt], " %ld", pipesummary[i]);
	cnt += sprintf(&page[cnt], "\n%s%s ", torture_type, TORTURE_FLAG);
	cnt += sprintf(&page[cnt], "Reader Batch: ");
	for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++)
		cnt += sprintf(&page[cnt], " %ld", batchsummary[i]);
	cnt += sprintf(&page[cnt], "\n%s%s ", torture_type, TORTURE_FLAG);
	cnt += sprintf(&page[cnt], "Free-Block Circulation: ");
	for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++) {
		cnt += sprintf(&page[cnt], " %d",
			       atomic_read(&rcu_torture_wcount[i]));
	}
	cnt += sprintf(&page[cnt], "\n");
	if (cur_ops->stats)
		cnt += cur_ops->stats(&page[cnt]);
	return cnt;
}

/*
 * Print torture statistics.  Caller must ensure that there is only
 * one call to this function at a given time!!!  This is normally
 * accomplished by relying on the module system to only have one copy
 * of the module loaded, and then by giving the rcu_torture_stats
 * kthread full control (or the init/cleanup functions when rcu_torture_stats
 * thread is not running).
 */
static void
rcu_torture_stats_print(void)
{
	int cnt;

	cnt = rcu_torture_printk(printk_buf);
	printk(KERN_ALERT "%s", printk_buf);
}

/*
 * Periodically prints torture statistics, if periodic statistics printing
 * was specified via the stat_interval module parameter.
 *
 * No need to worry about fullstop here, since this one doesn't reference
 * volatile state or register callbacks.
 */
static int
rcu_torture_stats(void *arg)
{
	VERBOSE_PRINTK_STRING("rcu_torture_stats task started");
	do {
		schedule_timeout_interruptible(stat_interval * HZ);
		rcu_torture_stats_print();
	} while (!kthread_should_stop());
	VERBOSE_PRINTK_STRING("rcu_torture_stats task stopping");
	return 0;
}

static int rcu_idle_cpu;	/* Force all torture tasks off this CPU */

/* Shuffle tasks such that we allow @rcu_idle_cpu to become idle. A special case
 * is when @rcu_idle_cpu = -1, when we allow the tasks to run on all CPUs.
 */
static void rcu_torture_shuffle_tasks(void)
{
	cpumask_t tmp_mask;
	int i;

	cpus_setall(tmp_mask);
	get_online_cpus();

	/* No point in shuffling if there is only one online CPU (ex: UP) */
	if (num_online_cpus() == 1) {
		put_online_cpus();
		return;
	}

	if (rcu_idle_cpu != -1)
		cpu_clear(rcu_idle_cpu, tmp_mask);

	set_cpus_allowed_ptr(current, &tmp_mask);

	if (reader_tasks) {
		for (i = 0; i < nrealreaders; i++)
			if (reader_tasks[i])
				set_cpus_allowed_ptr(reader_tasks[i],
						     &tmp_mask);
	}

	if (fakewriter_tasks) {
		for (i = 0; i < nfakewriters; i++)
			if (fakewriter_tasks[i])
				set_cpus_allowed_ptr(fakewriter_tasks[i],
						     &tmp_mask);
	}

	if (writer_task)
		set_cpus_allowed_ptr(writer_task, &tmp_mask);

	if (stats_task)
		set_cpus_allowed_ptr(stats_task, &tmp_mask);

	if (rcu_idle_cpu == -1)
		rcu_idle_cpu = num_online_cpus() - 1;
	else
		rcu_idle_cpu--;

	put_online_cpus();
}

/* Shuffle tasks across CPUs, with the intent of allowing each CPU in the
 * system to become idle at a time and cut off its timer ticks. This is meant
 * to test the support for such tickless idle CPU in RCU.
 */
static int
rcu_torture_shuffle(void *arg)
{
	VERBOSE_PRINTK_STRING("rcu_torture_shuffle task started");
	do {
		schedule_timeout_interruptible(shuffle_interval * HZ);
		rcu_torture_shuffle_tasks();
	} while (!kthread_should_stop());
	VERBOSE_PRINTK_STRING("rcu_torture_shuffle task stopping");
	return 0;
}

/* Cause the rcutorture test to "stutter", starting and stopping all
 * threads periodically.
 */
static int
rcu_torture_stutter(void *arg)
{
	VERBOSE_PRINTK_STRING("rcu_torture_stutter task started");
	do {
		schedule_timeout_interruptible(stutter * HZ);
		stutter_pause_test = 1;
		if (!kthread_should_stop())
			schedule_timeout_interruptible(stutter * HZ);
		stutter_pause_test = 0;
	} while (!kthread_should_stop());
	VERBOSE_PRINTK_STRING("rcu_torture_stutter task stopping");
	return 0;
}

static inline void
rcu_torture_print_module_parms(char *tag)
{
	printk(KERN_ALERT "%s" TORTURE_FLAG
		"--- %s: nreaders=%d nfakewriters=%d "
		"stat_interval=%d verbose=%d test_no_idle_hz=%d "
		"shuffle_interval=%d stutter=%d irqreader=%d\n",
		torture_type, tag, nrealreaders, nfakewriters,
		stat_interval, verbose, test_no_idle_hz, shuffle_interval,
		stutter, irqreader);
}

static void
rcu_torture_cleanup(void)
{
	int i;

	fullstop = 1;
	if (stutter_task) {
		VERBOSE_PRINTK_STRING("Stopping rcu_torture_stutter task");
		kthread_stop(stutter_task);
	}
	stutter_task = NULL;
	if (shuffler_task) {
		VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task");
		kthread_stop(shuffler_task);
	}
	shuffler_task = NULL;

	if (writer_task) {
		VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task");
		kthread_stop(writer_task);
	}
	writer_task = NULL;

	if (reader_tasks) {
		for (i = 0; i < nrealreaders; i++) {
			if (reader_tasks[i]) {
				VERBOSE_PRINTK_STRING(
					"Stopping rcu_torture_reader task");
				kthread_stop(reader_tasks[i]);
			}
			reader_tasks[i] = NULL;
		}
		kfree(reader_tasks);
		reader_tasks = NULL;
	}
	rcu_torture_current = NULL;

	if (fakewriter_tasks) {
		for (i = 0; i < nfakewriters; i++) {
			if (fakewriter_tasks[i]) {
				VERBOSE_PRINTK_STRING(
					"Stopping rcu_torture_fakewriter task");
				kthread_stop(fakewriter_tasks[i]);
			}
			fakewriter_tasks[i] = NULL;
		}
		kfree(fakewriter_tasks);
		fakewriter_tasks = NULL;
	}

	if (stats_task) {
		VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task");
		kthread_stop(stats_task);
	}
	stats_task = NULL;

	/* Wait for all RCU callbacks to fire.  */

	if (cur_ops->cb_barrier != NULL)
		cur_ops->cb_barrier();

	rcu_torture_stats_print();  /* -After- the stats thread is stopped! */

	if (cur_ops->cleanup)
		cur_ops->cleanup();
	if (atomic_read(&n_rcu_torture_error))
		rcu_torture_print_module_parms("End of test: FAILURE");
	else
		rcu_torture_print_module_parms("End of test: SUCCESS");
}

static int __init
rcu_torture_init(void)
{
	int i;
	int cpu;
	int firsterr = 0;
	static struct rcu_torture_ops *torture_ops[] =
		{ &rcu_ops, &rcu_sync_ops, &rcu_bh_ops, &rcu_bh_sync_ops,
		  &srcu_ops, &sched_ops, &sched_ops_sync, };

	/* Process args and tell the world that the torturer is on the job. */
	for (i = 0; i < ARRAY_SIZE(torture_ops); i++) {
		cur_ops = torture_ops[i];
		if (strcmp(torture_type, cur_ops->name) == 0)
			break;
	}
	if (i == ARRAY_SIZE(torture_ops)) {
		printk(KERN_ALERT "rcutorture: invalid torture type: \"%s\"\n",
		       torture_type);
		return (-EINVAL);
	}
	if (cur_ops->init)
		cur_ops->init(); /* no "goto unwind" prior to this point!!! */

	if (nreaders >= 0)
		nrealreaders = nreaders;
	else
		nrealreaders = 2 * num_online_cpus();
	rcu_torture_print_module_parms("Start of test");
	fullstop = 0;

	/* Set up the freelist. */

	INIT_LIST_HEAD(&rcu_torture_freelist);
	for (i = 0; i < ARRAY_SIZE(rcu_tortures); i++) {
		rcu_tortures[i].rtort_mbtest = 0;
		list_add_tail(&rcu_tortures[i].rtort_free,
			      &rcu_torture_freelist);
	}

	/* Initialize the statistics so that each run gets its own numbers. */

	rcu_torture_current = NULL;
	rcu_torture_current_version = 0;
	atomic_set(&n_rcu_torture_alloc, 0);
	atomic_set(&n_rcu_torture_alloc_fail, 0);
	atomic_set(&n_rcu_torture_free, 0);
	atomic_set(&n_rcu_torture_mberror, 0);
	atomic_set(&n_rcu_torture_error, 0);
	for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++)
		atomic_set(&rcu_torture_wcount[i], 0);
	for_each_possible_cpu(cpu) {
		for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++) {
			per_cpu(rcu_torture_count, cpu)[i] = 0;
			per_cpu(rcu_torture_batch, cpu)[i] = 0;
		}
	}

	/* Start up the kthreads. */

	VERBOSE_PRINTK_STRING("Creating rcu_torture_writer task");
	writer_task = kthread_run(rcu_torture_writer, NULL,
				  "rcu_torture_writer");
	if (IS_ERR(writer_task)) {
		firsterr = PTR_ERR(writer_task);
		VERBOSE_PRINTK_ERRSTRING("Failed to create writer");
		writer_task = NULL;
		goto unwind;
	}
	fakewriter_tasks = kzalloc(nfakewriters * sizeof(fakewriter_tasks[0]),
	                           GFP_KERNEL);
	if (fakewriter_tasks == NULL) {
		VERBOSE_PRINTK_ERRSTRING("out of memory");
		firsterr = -ENOMEM;
		goto unwind;
	}
	for (i = 0; i < nfakewriters; i++) {
		VERBOSE_PRINTK_STRING("Creating rcu_torture_fakewriter task");
		fakewriter_tasks[i] = kthread_run(rcu_torture_fakewriter, NULL,
		                                  "rcu_torture_fakewriter");
		if (IS_ERR(fakewriter_tasks[i])) {
			firsterr = PTR_ERR(fakewriter_tasks[i]);
			VERBOSE_PRINTK_ERRSTRING("Failed to create fakewriter");
			fakewriter_tasks[i] = NULL;
			goto unwind;
		}
	}
	reader_tasks = kzalloc(nrealreaders * sizeof(reader_tasks[0]),
			       GFP_KERNEL);
	if (reader_tasks == NULL) {
		VERBOSE_PRINTK_ERRSTRING("out of memory");
		firsterr = -ENOMEM;
		goto unwind;
	}
	for (i = 0; i < nrealreaders; i++) {
		VERBOSE_PRINTK_STRING("Creating rcu_torture_reader task");
		reader_tasks[i] = kthread_run(rcu_torture_reader, NULL,
					      "rcu_torture_reader");
		if (IS_ERR(reader_tasks[i])) {
			firsterr = PTR_ERR(reader_tasks[i]);
			VERBOSE_PRINTK_ERRSTRING("Failed to create reader");
			reader_tasks[i] = NULL;
			goto unwind;
		}
	}
	if (stat_interval > 0) {
		VERBOSE_PRINTK_STRING("Creating rcu_torture_stats task");
		stats_task = kthread_run(rcu_torture_stats, NULL,
					"rcu_torture_stats");
		if (IS_ERR(stats_task)) {
			firsterr = PTR_ERR(stats_task);
			VERBOSE_PRINTK_ERRSTRING("Failed to create stats");
			stats_task = NULL;
			goto unwind;
		}
	}
	if (test_no_idle_hz) {
		rcu_idle_cpu = num_online_cpus() - 1;
		/* Create the shuffler thread */
		shuffler_task = kthread_run(rcu_torture_shuffle, NULL,
					  "rcu_torture_shuffle");
		if (IS_ERR(shuffler_task)) {
			firsterr = PTR_ERR(shuffler_task);
			VERBOSE_PRINTK_ERRSTRING("Failed to create shuffler");
			shuffler_task = NULL;
			goto unwind;
		}
	}
	if (stutter < 0)
		stutter = 0;
	if (stutter) {
		/* Create the stutter thread */
		stutter_task = kthread_run(rcu_torture_stutter, NULL,
					  "rcu_torture_stutter");
		if (IS_ERR(stutter_task)) {
			firsterr = PTR_ERR(stutter_task);
			VERBOSE_PRINTK_ERRSTRING("Failed to create stutter");
			stutter_task = NULL;
			goto unwind;
		}
	}
	return 0;

unwind:
	rcu_torture_cleanup();
	return firsterr;
}

module_init(rcu_torture_init);
module_exit(rcu_torture_cleanup);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-06  2:15                                         ` Paul E. McKenney
@ 2009-01-06  7:47                                           ` Eric Sesterhenn
  2009-01-06 12:48                                             ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-06  7:47 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Mon, Jan 05, 2009 at 04:29:12PM -0800, Paul E. McKenney wrote:
> > On Mon, Jan 05, 2009 at 02:18:36PM -0800, Paul E. McKenney wrote:
> > > >      e98:	a1 98 00 00 00       	mov    0x98,%eax
> > > > 			e99: R_386_32	.bss
> > > >      e9d:	85 c0                	test   %eax,%eax
> > > >      e9f:	75 09                	jne    eaa <rcu_stutter_wait+0x1a>
> > > >      ea1:	a1 00 00 00 00       	mov    0x0,%eax
> > > > 			ea2: R_386_32	rcutorture_runnable
> > > >      ea6:	85 c0                	test   %eax,%eax
> > > >      ea8:	75 36                	jne    ee0 <rcu_stutter_wait+0x50>
> > > >      eaa:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > > > 			eab: R_386_32	.bss
> > > >      eaf:	85 c0                	test   %eax,%eax
> > > >      eb1:	75 2d                	jne    ee0 <rcu_stutter_wait+0x50>
> > > >      eb3:	8b 15 00 00 00 00    	mov    0x0,%edx
> > > > 			eb5: R_386_32	rcutorture_runnable
> > > >      eb9:	85 d2                	test   %edx,%edx
> > > >      ebb:	74 2b                	je     ee8 <rcu_stutter_wait+0x58>
> > > >      ebd:	b8 01 00 00 00       	mov    $0x1,%eax
> > > >      ec2:	e8 fc ff ff ff       	call   ec3 <rcu_stutter_wait+0x33>
> > > > 			ec3: R_386_PC32	schedule_timeout_interruptible
> > > >      ec7:	a1 98 00 00 00       	mov    0x98,%eax
> > > > 			ec8: R_386_32	.bss
> > > >      ecc:	85 c0                	test   %eax,%eax
> > > >      ece:	74 d1                	je     ea1 <rcu_stutter_wait+0x11>
> > > >      ed0:	a1 88 1a 00 00       	mov    0x1a88,%eax
> > > > 			ed1: R_386_32	.bss
> > > >      ed5:	85 c0                	test   %eax,%eax
> > > >      ed7:	74 da                	je     eb3 <rcu_stutter_wait+0x23>
> > > >      ed9:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > > >      ee0:	5d                   	pop    %ebp
> > > >      ee1:	c3                   	ret    
> > > >      ee2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > > >      ee8:	b8 fa 00 00 00       	mov    $0xfa,%eax
> > > >      eed:	e8 fc ff ff ff       	call   eee <rcu_stutter_wait+0x5e>
> > > > 			eee: R_386_PC32	round_jiffies_relative
> > > >      ef2:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi
> > > >      ef8:	e8 fc ff ff ff       	call   ef9 <rcu_stutter_wait+0x69>
> > > > 			ef9: R_386_PC32	schedule_timeout_interruptible
> > > >      efd:	8d 76 00             	lea    0x0(%esi),%esi
> > > > 
> > > > here is the deref ------------------------^
> > > 
> > > Ah!!!  We are getting a page fault while cleaning up the stack frame?
> > > 
> > > Ouch!
> > > 
> > > >      f00:	eb 96                	jmp    e98 <rcu_stutter_wait+0x8>
> > > >      f02:	8d b4 26 00 00 00 00 	lea    0x0(%esi,%eiz,1),%esi
> > > >      f09:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi
> > 
> > And now I can run this.  gcc v3.4.4 gives yet a different error:
> > 
> > divide error: 0000 [#1] PREEMPT SMP 
> > last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:02:04.0/host0/target0:0:6/0:0:6:0/type
> > CPU 1 
> > Modules linked in: [last unloaded: rcutorture]
> > Pid: 3369, comm: rcu_torture_rea Not tainted 2.6.28-autokern1 #1
> > RIP: 0010:[<ffffffffa0000142>]  [<ffffffffa0000142>] 0xffffffffa0000142
> > RSP: 0000:ffff88007f0afeb0  EFLAGS: 00010246
> > RAX: 00000000def36d6a RBX: ffffffffa0006850 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: ffff88007ecf1600 RDI: ffff88007f0afef0
> > RBP: 0000000000000dd4 R08: ffff88007f0ae000 R09: ffffffff8037c7d9
> > R10: 0000000000000000 R11: ffffffff8037c7d9 R12: 0000000000000000
> > R13: 000000000000000a R14: 0000000000000000 R15: 0000000000000000
> > FS:  0000000000000000(0000) GS:ffff8800e3801c00(0000) knlGS:00000000f7f3ab80
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 00000000080dc87c CR3: 0000000000201000 CR4: 00000000000006e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process rcu_torture_rea (pid: 3369, threadinfo ffff88007f0ae000, task ffff88007ecf5d80)
> > Stack:
> >  ffffffff8037c7d9 ffffffffa000090d ffff88007ecfdec0 ffff88007ec17eb0
> >  0000000000000001 ffffffffa00006e6 0000000000000000 ffff8800e3488000
> >  c744ca136d6adef3 0000000000000256 0000000000000000 ffffffffa0000807
> > Call Trace:
> >  [<ffffffff8037c7d9>] ? delay_tsc+0x0/0x99
> >  [<ffffffff80245b94>] ? kthread+0x3d/0x63
> >  [<ffffffff8020c3ea>] ? child_rip+0xa/0x20
> >  [<ffffffff80245b57>] ? kthread+0x0/0x63
> >  [<ffffffff8020c3e0>] ? child_rip+0x0/0x20
> > Code: 58 c3 56 bf 01 00 00 00 e8 88 bd 22 e0 59 31 c0 c3 41 51 e8 69 ff ff ff 48 63 15 0a 58 00 00 48 69 d2 90 01 00 00 48 89 d1 31 d2 <48> f7 f1 48 85 d2 75 0c 41 58 bf 78 1b 0d 00 e9 32 c7 37 e0 5f 
> > RIP  [<ffffffffa0000142>] 0xffffffffa0000142
> >  RSP <ffff88007f0afeb0>
> > 
> > I will see what I can find from this.
> 
> Hello, Eric,
> 
> Just for grins, I built the 2.6.28 kernel/rcutorture.c on b58602a.  It
> insmoded and rmmoded without errors.
> 
> Could you please try this on your setup?  I have attached this file in
> case that helps.

works like a charm

Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-06  7:47                                           ` Eric Sesterhenn
@ 2009-01-06 12:48                                             ` Paul E. McKenney
  2009-01-07 19:46                                               ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-06 12:48 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Tue, Jan 06, 2009 at 08:47:07AM +0100, Eric Sesterhenn wrote:
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Mon, Jan 05, 2009 at 04:29:12PM -0800, Paul E. McKenney wrote:
> > > On Mon, Jan 05, 2009 at 02:18:36PM -0800, Paul E. McKenney wrote:
> > > I will see what I can find from this.
> > 
> > Hello, Eric,
> > 
> > Just for grins, I built the 2.6.28 kernel/rcutorture.c on b58602a.  It
> > insmoded and rmmoded without errors.
> > 
> > Could you please try this on your setup?  I have attached this file in
> > case that helps.
> 
> works like a charm

Well, that certainly tells me where the bug lies.  ;-)

Thank you for all your hard work on this, I should have a patch soon.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-06 12:48                                             ` Paul E. McKenney
@ 2009-01-07 19:46                                               ` Paul E. McKenney
  2009-01-07 20:19                                                 ` Eric Sesterhenn
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-07 19:46 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

[-- Attachment #1: Type: text/plain, Size: 9368 bytes --]

On Tue, Jan 06, 2009 at 04:48:09AM -0800, Paul E. McKenney wrote:
> On Tue, Jan 06, 2009 at 08:47:07AM +0100, Eric Sesterhenn wrote:
> > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > On Mon, Jan 05, 2009 at 04:29:12PM -0800, Paul E. McKenney wrote:
> > > > On Mon, Jan 05, 2009 at 02:18:36PM -0800, Paul E. McKenney wrote:
> > > > I will see what I can find from this.
> > > 
> > > Hello, Eric,
> > > 
> > > Just for grins, I built the 2.6.28 kernel/rcutorture.c on b58602a.  It
> > > insmoded and rmmoded without errors.
> > > 
> > > Could you please try this on your setup?  I have attached this file in
> > > case that helps.
> > 
> > works like a charm
> 
> Well, that certainly tells me where the bug lies.  ;-)
> 
> Thank you for all your hard work on this, I should have a patch soon.

And here it is, against the original version of rcutorture.c (revert
all of my previous patches against rcutorture.c).  I am also attaching
the resulting rcutorture.c, if that makes things more convenient for you.

This one survives modprobe and rmmod, and also survives when shutdown
while running.  Here are the operating modes:

CONFIG_RCU_TORTURE_TEST=m

	This is the normal rcutorture build.  Use "modprobe rcutorture"
	(with optional arguments) to start, and "rmmod rcutorture" to
	stop.  If you shut the system down without doing the rmmod, you
	should see console output like:

	rcutorture thread rcu_torture_writer parking due to system shutdown

	One for each rcutorture kthread.

CONFIG_RCU_TORTURE_TEST=y
CONFIG_RCU_TORTURE_TEST_RUNNABLE=n

	Use this if you want rcutorture built in, but don't want the
	test to start running during early boot.  To start the
	torturing:

		echo 1 > /proc/sys/kernel/rcutorture_runnable

	To stop the torturing, s/1/0/

	You will get "parking" console messages as noted above when
	you shut the system down.

CONFIG_RCU_TORTURE_TEST=y
CONFIG_RCU_TORTURE_TEST_RUNNABLE=y

	Same as above, except that the torturing starts during early
	boot.  Only for the stout of heart and strong of stomach.
	The same /proc entry noted above may be used to control the
	test.

Could you please fire this up and see if it fixes the rmmod-time bug
that you located?

						Thanx, Paul

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 rcutorture.c |   94 +++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 59 insertions(+), 35 deletions(-)

diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
index b310655..7c4142a 100644
--- a/kernel/rcutorture.c
+++ b/kernel/rcutorture.c
@@ -136,31 +136,47 @@ static int stutter_pause_test = 0;
 #endif
 int rcutorture_runnable = RCUTORTURE_RUNNABLE_INIT;
 
-#define FULLSTOP_SIGNALED 1	/* Bail due to signal. */
-#define FULLSTOP_CLEANUP  2	/* Orderly shutdown. */
-static int fullstop;		/* stop generating callbacks at test end. */
-DEFINE_MUTEX(fullstop_mutex);	/* protect fullstop transitions and */
-				/*  spawning of kthreads. */
+/* Mediate rmmod and system shutdown.  Concurrent rmmod & shutdown illegal! */
+
+#define FULLSTOP_DONTSTOP 0	/* Normal operation. */
+#define FULLSTOP_SHUTDOWN 1	/* System shutdown with rcutorture running. */
+#define FULLSTOP_RMMOD    2	/* Normal rmmod of rcutorture. */
+static int fullstop = FULLSTOP_RMMOD;
+DEFINE_MUTEX(fullstop_mutex);	/* Protect fullstop transitions and spawning */
+				/*  of kthreads. */
 
 /*
- * Detect and respond to a signal-based shutdown.
+ * Detect and respond to a system shutdown.
  */
 static int
 rcutorture_shutdown_notify(struct notifier_block *unused1,
 			   unsigned long unused2, void *unused3)
 {
-	if (fullstop)
-		return NOTIFY_DONE;
-	if (signal_pending(current)) {
-		mutex_lock(&fullstop_mutex);
-		if (!ACCESS_ONCE(fullstop))
-			fullstop = FULLSTOP_SIGNALED;
-		mutex_unlock(&fullstop_mutex);
-	}
+	mutex_lock(&fullstop_mutex);
+	if (fullstop == FULLSTOP_DONTSTOP)
+		fullstop = FULLSTOP_SHUTDOWN;
+	else
+		printk(KERN_WARNING /* but going down anyway, so... */
+		       "Concurrent 'rmmod rcutorture' and shutdown illegal!\n");
+	mutex_unlock(&fullstop_mutex);
 	return NOTIFY_DONE;
 }
 
 /*
+ * Absorb kthreads into a kernel function that won't return, so that
+ * they won't ever access module text or data again.
+ */
+static void rcutorture_shutdown_absorb(char *title)
+{
+	if (ACCESS_ONCE(fullstop) == FULLSTOP_SHUTDOWN) {
+		printk(KERN_NOTICE
+		       "rcutorture thread %s parking due to system shutdown\n",
+		       title);
+		schedule_timeout_uninterruptible(MAX_SCHEDULE_TIMEOUT);
+	}
+}
+
+/*
  * Allocate an element from the rcu_tortures pool.
  */
 static struct rcu_torture *
@@ -221,13 +237,14 @@ rcu_random(struct rcu_random_state *rrsp)
 }
 
 static void
-rcu_stutter_wait(void)
+rcu_stutter_wait(char *title)
 {
-	while ((stutter_pause_test || !rcutorture_runnable) && !fullstop) {
+	while (stutter_pause_test || !rcutorture_runnable) {
 		if (rcutorture_runnable)
 			schedule_timeout_interruptible(1);
 		else
 			schedule_timeout_interruptible(round_jiffies_relative(HZ));
+		rcutorture_shutdown_absorb(title);
 	}
 }
 
@@ -289,7 +306,7 @@ rcu_torture_cb(struct rcu_head *p)
 	int i;
 	struct rcu_torture *rp = container_of(p, struct rcu_torture, rtort_rcu);
 
-	if (fullstop) {
+	if (fullstop != FULLSTOP_DONTSTOP) {
 		/* Test is ending, just drop callbacks on the floor. */
 		/* The next initialization will pick up the pieces. */
 		return;
@@ -621,10 +638,11 @@ rcu_torture_writer(void *arg)
 		}
 		rcu_torture_current_version++;
 		oldbatch = cur_ops->completed();
-		rcu_stutter_wait();
-	} while (!kthread_should_stop() && !fullstop);
+		rcu_stutter_wait("rcu_torture_writer");
+	} while (!kthread_should_stop() && fullstop == FULLSTOP_DONTSTOP);
 	VERBOSE_PRINTK_STRING("rcu_torture_writer task stopping");
-	while (!kthread_should_stop() && fullstop != FULLSTOP_SIGNALED)
+	rcutorture_shutdown_absorb("rcu_torture_writer");
+	while (!kthread_should_stop())
 		schedule_timeout_uninterruptible(1);
 	return 0;
 }
@@ -645,11 +663,12 @@ rcu_torture_fakewriter(void *arg)
 		schedule_timeout_uninterruptible(1 + rcu_random(&rand)%10);
 		udelay(rcu_random(&rand) & 0x3ff);
 		cur_ops->sync();
-		rcu_stutter_wait();
-	} while (!kthread_should_stop() && !fullstop);
+		rcu_stutter_wait("rcu_torture_fakewriter");
+	} while (!kthread_should_stop() && fullstop == FULLSTOP_DONTSTOP);
 
 	VERBOSE_PRINTK_STRING("rcu_torture_fakewriter task stopping");
-	while (!kthread_should_stop() && fullstop != FULLSTOP_SIGNALED)
+	rcutorture_shutdown_absorb("rcu_torture_fakewriter");
+	while (!kthread_should_stop())
 		schedule_timeout_uninterruptible(1);
 	return 0;
 }
@@ -754,12 +773,13 @@ rcu_torture_reader(void *arg)
 		preempt_enable();
 		cur_ops->readunlock(idx);
 		schedule();
-		rcu_stutter_wait();
-	} while (!kthread_should_stop() && !fullstop);
+		rcu_stutter_wait("rcu_torture_reader");
+	} while (!kthread_should_stop() && fullstop == FULLSTOP_DONTSTOP);
 	VERBOSE_PRINTK_STRING("rcu_torture_reader task stopping");
+	rcutorture_shutdown_absorb("rcu_torture_reader");
 	if (irqreader && cur_ops->irqcapable)
 		del_timer_sync(&t);
-	while (!kthread_should_stop() && fullstop != FULLSTOP_SIGNALED)
+	while (!kthread_should_stop())
 		schedule_timeout_uninterruptible(1);
 	return 0;
 }
@@ -856,7 +876,8 @@ rcu_torture_stats(void *arg)
 	do {
 		schedule_timeout_interruptible(stat_interval * HZ);
 		rcu_torture_stats_print();
-	} while (!kthread_should_stop() && !fullstop);
+		rcutorture_shutdown_absorb("rcu_torture_stats");
+	} while (!kthread_should_stop());
 	VERBOSE_PRINTK_STRING("rcu_torture_stats task stopping");
 	return 0;
 }
@@ -924,7 +945,8 @@ rcu_torture_shuffle(void *arg)
 	do {
 		schedule_timeout_interruptible(shuffle_interval * HZ);
 		rcu_torture_shuffle_tasks();
-	} while (!kthread_should_stop() && !fullstop);
+		rcutorture_shutdown_absorb("rcu_torture_shuffle");
+	} while (!kthread_should_stop());
 	VERBOSE_PRINTK_STRING("rcu_torture_shuffle task stopping");
 	return 0;
 }
@@ -939,10 +961,11 @@ rcu_torture_stutter(void *arg)
 	do {
 		schedule_timeout_interruptible(stutter * HZ);
 		stutter_pause_test = 1;
-		if (!kthread_should_stop() && !fullstop)
+		if (!kthread_should_stop())
 			schedule_timeout_interruptible(stutter * HZ);
 		stutter_pause_test = 0;
-	} while (!kthread_should_stop() && !fullstop);
+		rcutorture_shutdown_absorb("rcu_torture_stutter");
+	} while (!kthread_should_stop());
 	VERBOSE_PRINTK_STRING("rcu_torture_stutter task stopping");
 	return 0;
 }
@@ -969,15 +992,16 @@ rcu_torture_cleanup(void)
 	int i;
 
 	mutex_lock(&fullstop_mutex);
-	if (!fullstop) {
-		/* If being signaled, let it happen, then exit. */
+	if (fullstop == FULLSTOP_SHUTDOWN) {
+		printk(KERN_WARNING /* but going down anyway, so... */
+		       "Concurrent 'rmmod rcutorture' and shutdown illegal!\n");
 		mutex_unlock(&fullstop_mutex);
-		schedule_timeout_interruptible(10 * HZ);
+		schedule_timeout_uninterruptible(10);
 		if (cur_ops->cb_barrier != NULL)
 			cur_ops->cb_barrier();
 		return;
 	}
-	fullstop = FULLSTOP_CLEANUP;
+	fullstop = FULLSTOP_RMMOD;
 	mutex_unlock(&fullstop_mutex);
 	unregister_reboot_notifier(&rcutorture_nb);
 	if (stutter_task) {
@@ -1077,7 +1101,7 @@ rcu_torture_init(void)
 	else
 		nrealreaders = 2 * num_online_cpus();
 	rcu_torture_print_module_parms("Start of test");
-	fullstop = 0;
+	fullstop = FULLSTOP_DONTSTOP;
 
 	/* Set up the freelist. */
 

[-- Attachment #2: kernel/rcutorture.c --]
[-- Type: text/x-csrc, Size: 34295 bytes --]

/*
 * Read-Copy Update module-based torture test facility
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * Copyright (C) IBM Corporation, 2005, 2006
 *
 * Authors: Paul E. McKenney <paulmck@us.ibm.com>
 *          Josh Triplett <josh@freedesktop.org>
 *
 * See also:  Documentation/RCU/torture.txt
 */
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/err.h>
#include <linux/spinlock.h>
#include <linux/smp.h>
#include <linux/rcupdate.h>
#include <linux/interrupt.h>
#include <linux/sched.h>
#include <asm/atomic.h>
#include <linux/bitops.h>
#include <linux/completion.h>
#include <linux/moduleparam.h>
#include <linux/percpu.h>
#include <linux/notifier.h>
#include <linux/reboot.h>
#include <linux/freezer.h>
#include <linux/cpu.h>
#include <linux/delay.h>
#include <linux/stat.h>
#include <linux/srcu.h>
#include <linux/slab.h>
#include <asm/byteorder.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com> and "
              "Josh Triplett <josh@freedesktop.org>");

static int nreaders = -1;	/* # reader threads, defaults to 2*ncpus */
static int nfakewriters = 4;	/* # fake writer threads */
static int stat_interval;	/* Interval between stats, in seconds. */
				/*  Defaults to "only at end of test". */
static int verbose;		/* Print more debug info. */
static int test_no_idle_hz;	/* Test RCU's support for tickless idle CPUs. */
static int shuffle_interval = 3; /* Interval between shuffles (in sec)*/
static int stutter = 5;		/* Start/stop testing interval (in sec) */
static int irqreader = 1;	/* RCU readers from irq (timers). */
static char *torture_type = "rcu"; /* What RCU implementation to torture. */

module_param(nreaders, int, 0444);
MODULE_PARM_DESC(nreaders, "Number of RCU reader threads");
module_param(nfakewriters, int, 0444);
MODULE_PARM_DESC(nfakewriters, "Number of RCU fake writer threads");
module_param(stat_interval, int, 0444);
MODULE_PARM_DESC(stat_interval, "Number of seconds between stats printk()s");
module_param(verbose, bool, 0444);
MODULE_PARM_DESC(verbose, "Enable verbose debugging printk()s");
module_param(test_no_idle_hz, bool, 0444);
MODULE_PARM_DESC(test_no_idle_hz, "Test support for tickless idle CPUs");
module_param(shuffle_interval, int, 0444);
MODULE_PARM_DESC(shuffle_interval, "Number of seconds between shuffles");
module_param(stutter, int, 0444);
MODULE_PARM_DESC(stutter, "Number of seconds to run/halt test");
module_param(irqreader, int, 0444);
MODULE_PARM_DESC(irqreader, "Allow RCU readers from irq handlers");
module_param(torture_type, charp, 0444);
MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh, srcu)");

#define TORTURE_FLAG "-torture:"
#define PRINTK_STRING(s) \
	do { printk(KERN_ALERT "%s" TORTURE_FLAG s "\n", torture_type); } while (0)
#define VERBOSE_PRINTK_STRING(s) \
	do { if (verbose) printk(KERN_ALERT "%s" TORTURE_FLAG s "\n", torture_type); } while (0)
#define VERBOSE_PRINTK_ERRSTRING(s) \
	do { if (verbose) printk(KERN_ALERT "%s" TORTURE_FLAG "!!! " s "\n", torture_type); } while (0)

static char printk_buf[4096];

static int nrealreaders;
static struct task_struct *writer_task;
static struct task_struct **fakewriter_tasks;
static struct task_struct **reader_tasks;
static struct task_struct *stats_task;
static struct task_struct *shuffler_task;
static struct task_struct *stutter_task;

#define RCU_TORTURE_PIPE_LEN 10

struct rcu_torture {
	struct rcu_head rtort_rcu;
	int rtort_pipe_count;
	struct list_head rtort_free;
	int rtort_mbtest;
};

static LIST_HEAD(rcu_torture_freelist);
static struct rcu_torture *rcu_torture_current = NULL;
static long rcu_torture_current_version = 0;
static struct rcu_torture rcu_tortures[10 * RCU_TORTURE_PIPE_LEN];
static DEFINE_SPINLOCK(rcu_torture_lock);
static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1], rcu_torture_count) =
	{ 0 };
static DEFINE_PER_CPU(long [RCU_TORTURE_PIPE_LEN + 1], rcu_torture_batch) =
	{ 0 };
static atomic_t rcu_torture_wcount[RCU_TORTURE_PIPE_LEN + 1];
static atomic_t n_rcu_torture_alloc;
static atomic_t n_rcu_torture_alloc_fail;
static atomic_t n_rcu_torture_free;
static atomic_t n_rcu_torture_mberror;
static atomic_t n_rcu_torture_error;
static long n_rcu_torture_timers = 0;
static struct list_head rcu_torture_removed;

static int stutter_pause_test = 0;

#if defined(MODULE) || defined(CONFIG_RCU_TORTURE_TEST_RUNNABLE)
#define RCUTORTURE_RUNNABLE_INIT 1
#else
#define RCUTORTURE_RUNNABLE_INIT 0
#endif
int rcutorture_runnable = RCUTORTURE_RUNNABLE_INIT;

/* Mediate rmmod and system shutdown.  Concurrent rmmod & shutdown illegal! */

#define FULLSTOP_DONTSTOP 0	/* Normal operation. */
#define FULLSTOP_SHUTDOWN 1	/* System shutdown with rcutorture running. */
#define FULLSTOP_RMMOD    2	/* Normal rmmod of rcutorture. */
static int fullstop = FULLSTOP_RMMOD;
DEFINE_MUTEX(fullstop_mutex);	/* Protect fullstop transitions and spawning */
				/*  of kthreads. */

/*
 * Detect and respond to a system shutdown.
 */
static int
rcutorture_shutdown_notify(struct notifier_block *unused1,
			   unsigned long unused2, void *unused3)
{
	mutex_lock(&fullstop_mutex);
	if (fullstop == FULLSTOP_DONTSTOP)
		fullstop = FULLSTOP_SHUTDOWN;
	else
		printk(KERN_WARNING /* but going down anyway, so... */
		       "Concurrent 'rmmod rcutorture' and shutdown illegal!\n");
	mutex_unlock(&fullstop_mutex);
	return NOTIFY_DONE;
}

/*
 * Absorb kthreads into a kernel function that won't return, so that
 * they won't ever access module text or data again.
 */
static void rcutorture_shutdown_absorb(char *title)
{
	if (ACCESS_ONCE(fullstop) == FULLSTOP_SHUTDOWN) {
		printk(KERN_NOTICE
		       "rcutorture thread %s parking due to system shutdown\n",
		       title);
		schedule_timeout_uninterruptible(MAX_SCHEDULE_TIMEOUT);
	}
}

/*
 * Allocate an element from the rcu_tortures pool.
 */
static struct rcu_torture *
rcu_torture_alloc(void)
{
	struct list_head *p;

	spin_lock_bh(&rcu_torture_lock);
	if (list_empty(&rcu_torture_freelist)) {
		atomic_inc(&n_rcu_torture_alloc_fail);
		spin_unlock_bh(&rcu_torture_lock);
		return NULL;
	}
	atomic_inc(&n_rcu_torture_alloc);
	p = rcu_torture_freelist.next;
	list_del_init(p);
	spin_unlock_bh(&rcu_torture_lock);
	return container_of(p, struct rcu_torture, rtort_free);
}

/*
 * Free an element to the rcu_tortures pool.
 */
static void
rcu_torture_free(struct rcu_torture *p)
{
	atomic_inc(&n_rcu_torture_free);
	spin_lock_bh(&rcu_torture_lock);
	list_add_tail(&p->rtort_free, &rcu_torture_freelist);
	spin_unlock_bh(&rcu_torture_lock);
}

struct rcu_random_state {
	unsigned long rrs_state;
	long rrs_count;
};

#define RCU_RANDOM_MULT 39916801  /* prime */
#define RCU_RANDOM_ADD	479001701 /* prime */
#define RCU_RANDOM_REFRESH 10000

#define DEFINE_RCU_RANDOM(name) struct rcu_random_state name = { 0, 0 }

/*
 * Crude but fast random-number generator.  Uses a linear congruential
 * generator, with occasional help from cpu_clock().
 */
static unsigned long
rcu_random(struct rcu_random_state *rrsp)
{
	if (--rrsp->rrs_count < 0) {
		rrsp->rrs_state +=
			(unsigned long)cpu_clock(raw_smp_processor_id());
		rrsp->rrs_count = RCU_RANDOM_REFRESH;
	}
	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
	return swahw32(rrsp->rrs_state);
}

static void
rcu_stutter_wait(char *title)
{
	while (stutter_pause_test || !rcutorture_runnable) {
		if (rcutorture_runnable)
			schedule_timeout_interruptible(1);
		else
			schedule_timeout_interruptible(round_jiffies_relative(HZ));
		rcutorture_shutdown_absorb(title);
	}
}

/*
 * Operations vector for selecting different types of tests.
 */

struct rcu_torture_ops {
	void (*init)(void);
	void (*cleanup)(void);
	int (*readlock)(void);
	void (*readdelay)(struct rcu_random_state *rrsp);
	void (*readunlock)(int idx);
	int (*completed)(void);
	void (*deferredfree)(struct rcu_torture *p);
	void (*sync)(void);
	void (*cb_barrier)(void);
	int (*stats)(char *page);
	int irqcapable;
	char *name;
};
static struct rcu_torture_ops *cur_ops = NULL;

/*
 * Definitions for rcu torture testing.
 */

static int rcu_torture_read_lock(void) __acquires(RCU)
{
	rcu_read_lock();
	return 0;
}

static void rcu_read_delay(struct rcu_random_state *rrsp)
{
	long delay;
	const long longdelay = 200;

	/* We want there to be long-running readers, but not all the time. */

	delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay);
	if (!delay)
		udelay(longdelay);
}

static void rcu_torture_read_unlock(int idx) __releases(RCU)
{
	rcu_read_unlock();
}

static int rcu_torture_completed(void)
{
	return rcu_batches_completed();
}

static void
rcu_torture_cb(struct rcu_head *p)
{
	int i;
	struct rcu_torture *rp = container_of(p, struct rcu_torture, rtort_rcu);

	if (fullstop != FULLSTOP_DONTSTOP) {
		/* Test is ending, just drop callbacks on the floor. */
		/* The next initialization will pick up the pieces. */
		return;
	}
	i = rp->rtort_pipe_count;
	if (i > RCU_TORTURE_PIPE_LEN)
		i = RCU_TORTURE_PIPE_LEN;
	atomic_inc(&rcu_torture_wcount[i]);
	if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
		rp->rtort_mbtest = 0;
		rcu_torture_free(rp);
	} else
		cur_ops->deferredfree(rp);
}

static void rcu_torture_deferred_free(struct rcu_torture *p)
{
	call_rcu(&p->rtort_rcu, rcu_torture_cb);
}

static struct rcu_torture_ops rcu_ops = {
	.init = NULL,
	.cleanup = NULL,
	.readlock = rcu_torture_read_lock,
	.readdelay = rcu_read_delay,
	.readunlock = rcu_torture_read_unlock,
	.completed = rcu_torture_completed,
	.deferredfree = rcu_torture_deferred_free,
	.sync = synchronize_rcu,
	.cb_barrier = rcu_barrier,
	.stats = NULL,
	.irqcapable = 1,
	.name = "rcu"
};

static void rcu_sync_torture_deferred_free(struct rcu_torture *p)
{
	int i;
	struct rcu_torture *rp;
	struct rcu_torture *rp1;

	cur_ops->sync();
	list_add(&p->rtort_free, &rcu_torture_removed);
	list_for_each_entry_safe(rp, rp1, &rcu_torture_removed, rtort_free) {
		i = rp->rtort_pipe_count;
		if (i > RCU_TORTURE_PIPE_LEN)
			i = RCU_TORTURE_PIPE_LEN;
		atomic_inc(&rcu_torture_wcount[i]);
		if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
			rp->rtort_mbtest = 0;
			list_del(&rp->rtort_free);
			rcu_torture_free(rp);
		}
	}
}

static void rcu_sync_torture_init(void)
{
	INIT_LIST_HEAD(&rcu_torture_removed);
}

static struct rcu_torture_ops rcu_sync_ops = {
	.init = rcu_sync_torture_init,
	.cleanup = NULL,
	.readlock = rcu_torture_read_lock,
	.readdelay = rcu_read_delay,
	.readunlock = rcu_torture_read_unlock,
	.completed = rcu_torture_completed,
	.deferredfree = rcu_sync_torture_deferred_free,
	.sync = synchronize_rcu,
	.cb_barrier = NULL,
	.stats = NULL,
	.irqcapable = 1,
	.name = "rcu_sync"
};

/*
 * Definitions for rcu_bh torture testing.
 */

static int rcu_bh_torture_read_lock(void) __acquires(RCU_BH)
{
	rcu_read_lock_bh();
	return 0;
}

static void rcu_bh_torture_read_unlock(int idx) __releases(RCU_BH)
{
	rcu_read_unlock_bh();
}

static int rcu_bh_torture_completed(void)
{
	return rcu_batches_completed_bh();
}

static void rcu_bh_torture_deferred_free(struct rcu_torture *p)
{
	call_rcu_bh(&p->rtort_rcu, rcu_torture_cb);
}

struct rcu_bh_torture_synchronize {
	struct rcu_head head;
	struct completion completion;
};

static void rcu_bh_torture_wakeme_after_cb(struct rcu_head *head)
{
	struct rcu_bh_torture_synchronize *rcu;

	rcu = container_of(head, struct rcu_bh_torture_synchronize, head);
	complete(&rcu->completion);
}

static void rcu_bh_torture_synchronize(void)
{
	struct rcu_bh_torture_synchronize rcu;

	init_completion(&rcu.completion);
	call_rcu_bh(&rcu.head, rcu_bh_torture_wakeme_after_cb);
	wait_for_completion(&rcu.completion);
}

static struct rcu_torture_ops rcu_bh_ops = {
	.init = NULL,
	.cleanup = NULL,
	.readlock = rcu_bh_torture_read_lock,
	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
	.readunlock = rcu_bh_torture_read_unlock,
	.completed = rcu_bh_torture_completed,
	.deferredfree = rcu_bh_torture_deferred_free,
	.sync = rcu_bh_torture_synchronize,
	.cb_barrier = rcu_barrier_bh,
	.stats = NULL,
	.irqcapable = 1,
	.name = "rcu_bh"
};

static struct rcu_torture_ops rcu_bh_sync_ops = {
	.init = rcu_sync_torture_init,
	.cleanup = NULL,
	.readlock = rcu_bh_torture_read_lock,
	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
	.readunlock = rcu_bh_torture_read_unlock,
	.completed = rcu_bh_torture_completed,
	.deferredfree = rcu_sync_torture_deferred_free,
	.sync = rcu_bh_torture_synchronize,
	.cb_barrier = NULL,
	.stats = NULL,
	.irqcapable = 1,
	.name = "rcu_bh_sync"
};

/*
 * Definitions for srcu torture testing.
 */

static struct srcu_struct srcu_ctl;

static void srcu_torture_init(void)
{
	init_srcu_struct(&srcu_ctl);
	rcu_sync_torture_init();
}

static void srcu_torture_cleanup(void)
{
	synchronize_srcu(&srcu_ctl);
	cleanup_srcu_struct(&srcu_ctl);
}

static int srcu_torture_read_lock(void) __acquires(&srcu_ctl)
{
	return srcu_read_lock(&srcu_ctl);
}

static void srcu_read_delay(struct rcu_random_state *rrsp)
{
	long delay;
	const long uspertick = 1000000 / HZ;
	const long longdelay = 10;

	/* We want there to be long-running readers, but not all the time. */

	delay = rcu_random(rrsp) % (nrealreaders * 2 * longdelay * uspertick);
	if (!delay)
		schedule_timeout_interruptible(longdelay);
}

static void srcu_torture_read_unlock(int idx) __releases(&srcu_ctl)
{
	srcu_read_unlock(&srcu_ctl, idx);
}

static int srcu_torture_completed(void)
{
	return srcu_batches_completed(&srcu_ctl);
}

static void srcu_torture_synchronize(void)
{
	synchronize_srcu(&srcu_ctl);
}

static int srcu_torture_stats(char *page)
{
	int cnt = 0;
	int cpu;
	int idx = srcu_ctl.completed & 0x1;

	cnt += sprintf(&page[cnt], "%s%s per-CPU(idx=%d):",
		       torture_type, TORTURE_FLAG, idx);
	for_each_possible_cpu(cpu) {
		cnt += sprintf(&page[cnt], " %d(%d,%d)", cpu,
			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx],
			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]);
	}
	cnt += sprintf(&page[cnt], "\n");
	return cnt;
}

static struct rcu_torture_ops srcu_ops = {
	.init = srcu_torture_init,
	.cleanup = srcu_torture_cleanup,
	.readlock = srcu_torture_read_lock,
	.readdelay = srcu_read_delay,
	.readunlock = srcu_torture_read_unlock,
	.completed = srcu_torture_completed,
	.deferredfree = rcu_sync_torture_deferred_free,
	.sync = srcu_torture_synchronize,
	.cb_barrier = NULL,
	.stats = srcu_torture_stats,
	.name = "srcu"
};

/*
 * Definitions for sched torture testing.
 */

static int sched_torture_read_lock(void)
{
	preempt_disable();
	return 0;
}

static void sched_torture_read_unlock(int idx)
{
	preempt_enable();
}

static int sched_torture_completed(void)
{
	return 0;
}

static void rcu_sched_torture_deferred_free(struct rcu_torture *p)
{
	call_rcu_sched(&p->rtort_rcu, rcu_torture_cb);
}

static void sched_torture_synchronize(void)
{
	synchronize_sched();
}

static struct rcu_torture_ops sched_ops = {
	.init = rcu_sync_torture_init,
	.cleanup = NULL,
	.readlock = sched_torture_read_lock,
	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
	.readunlock = sched_torture_read_unlock,
	.completed = sched_torture_completed,
	.deferredfree = rcu_sched_torture_deferred_free,
	.sync = sched_torture_synchronize,
	.cb_barrier = rcu_barrier_sched,
	.stats = NULL,
	.irqcapable = 1,
	.name = "sched"
};

static struct rcu_torture_ops sched_ops_sync = {
	.init = rcu_sync_torture_init,
	.cleanup = NULL,
	.readlock = sched_torture_read_lock,
	.readdelay = rcu_read_delay,  /* just reuse rcu's version. */
	.readunlock = sched_torture_read_unlock,
	.completed = sched_torture_completed,
	.deferredfree = rcu_sync_torture_deferred_free,
	.sync = sched_torture_synchronize,
	.cb_barrier = NULL,
	.stats = NULL,
	.name = "sched_sync"
};

/*
 * RCU torture writer kthread.  Repeatedly substitutes a new structure
 * for that pointed to by rcu_torture_current, freeing the old structure
 * after a series of grace periods (the "pipeline").
 */
static int
rcu_torture_writer(void *arg)
{
	int i;
	long oldbatch = rcu_batches_completed();
	struct rcu_torture *rp;
	struct rcu_torture *old_rp;
	static DEFINE_RCU_RANDOM(rand);

	VERBOSE_PRINTK_STRING("rcu_torture_writer task started");
	set_user_nice(current, 19);

	do {
		schedule_timeout_uninterruptible(1);
		if ((rp = rcu_torture_alloc()) == NULL)
			continue;
		rp->rtort_pipe_count = 0;
		udelay(rcu_random(&rand) & 0x3ff);
		old_rp = rcu_torture_current;
		rp->rtort_mbtest = 1;
		rcu_assign_pointer(rcu_torture_current, rp);
		smp_wmb();
		if (old_rp) {
			i = old_rp->rtort_pipe_count;
			if (i > RCU_TORTURE_PIPE_LEN)
				i = RCU_TORTURE_PIPE_LEN;
			atomic_inc(&rcu_torture_wcount[i]);
			old_rp->rtort_pipe_count++;
			cur_ops->deferredfree(old_rp);
		}
		rcu_torture_current_version++;
		oldbatch = cur_ops->completed();
		rcu_stutter_wait("rcu_torture_writer");
	} while (!kthread_should_stop() && fullstop == FULLSTOP_DONTSTOP);
	VERBOSE_PRINTK_STRING("rcu_torture_writer task stopping");
	rcutorture_shutdown_absorb("rcu_torture_writer");
	while (!kthread_should_stop())
		schedule_timeout_uninterruptible(1);
	return 0;
}

/*
 * RCU torture fake writer kthread.  Repeatedly calls sync, with a random
 * delay between calls.
 */
static int
rcu_torture_fakewriter(void *arg)
{
	DEFINE_RCU_RANDOM(rand);

	VERBOSE_PRINTK_STRING("rcu_torture_fakewriter task started");
	set_user_nice(current, 19);

	do {
		schedule_timeout_uninterruptible(1 + rcu_random(&rand)%10);
		udelay(rcu_random(&rand) & 0x3ff);
		cur_ops->sync();
		rcu_stutter_wait("rcu_torture_fakewriter");
	} while (!kthread_should_stop() && fullstop == FULLSTOP_DONTSTOP);

	VERBOSE_PRINTK_STRING("rcu_torture_fakewriter task stopping");
	rcutorture_shutdown_absorb("rcu_torture_fakewriter");
	while (!kthread_should_stop())
		schedule_timeout_uninterruptible(1);
	return 0;
}

/*
 * RCU torture reader from timer handler.  Dereferences rcu_torture_current,
 * incrementing the corresponding element of the pipeline array.  The
 * counter in the element should never be greater than 1, otherwise, the
 * RCU implementation is broken.
 */
static void rcu_torture_timer(unsigned long unused)
{
	int idx;
	int completed;
	static DEFINE_RCU_RANDOM(rand);
	static DEFINE_SPINLOCK(rand_lock);
	struct rcu_torture *p;
	int pipe_count;

	idx = cur_ops->readlock();
	completed = cur_ops->completed();
	p = rcu_dereference(rcu_torture_current);
	if (p == NULL) {
		/* Leave because rcu_torture_writer is not yet underway */
		cur_ops->readunlock(idx);
		return;
	}
	if (p->rtort_mbtest == 0)
		atomic_inc(&n_rcu_torture_mberror);
	spin_lock(&rand_lock);
	cur_ops->readdelay(&rand);
	n_rcu_torture_timers++;
	spin_unlock(&rand_lock);
	preempt_disable();
	pipe_count = p->rtort_pipe_count;
	if (pipe_count > RCU_TORTURE_PIPE_LEN) {
		/* Should not happen, but... */
		pipe_count = RCU_TORTURE_PIPE_LEN;
	}
	++__get_cpu_var(rcu_torture_count)[pipe_count];
	completed = cur_ops->completed() - completed;
	if (completed > RCU_TORTURE_PIPE_LEN) {
		/* Should not happen, but... */
		completed = RCU_TORTURE_PIPE_LEN;
	}
	++__get_cpu_var(rcu_torture_batch)[completed];
	preempt_enable();
	cur_ops->readunlock(idx);
}

/*
 * RCU torture reader kthread.  Repeatedly dereferences rcu_torture_current,
 * incrementing the corresponding element of the pipeline array.  The
 * counter in the element should never be greater than 1, otherwise, the
 * RCU implementation is broken.
 */
static int
rcu_torture_reader(void *arg)
{
	int completed;
	int idx;
	DEFINE_RCU_RANDOM(rand);
	struct rcu_torture *p;
	int pipe_count;
	struct timer_list t;

	VERBOSE_PRINTK_STRING("rcu_torture_reader task started");
	set_user_nice(current, 19);
	if (irqreader && cur_ops->irqcapable)
		setup_timer_on_stack(&t, rcu_torture_timer, 0);

	do {
		if (irqreader && cur_ops->irqcapable) {
			if (!timer_pending(&t))
				mod_timer(&t, 1);
		}
		idx = cur_ops->readlock();
		completed = cur_ops->completed();
		p = rcu_dereference(rcu_torture_current);
		if (p == NULL) {
			/* Wait for rcu_torture_writer to get underway */
			cur_ops->readunlock(idx);
			schedule_timeout_interruptible(HZ);
			continue;
		}
		if (p->rtort_mbtest == 0)
			atomic_inc(&n_rcu_torture_mberror);
		cur_ops->readdelay(&rand);
		preempt_disable();
		pipe_count = p->rtort_pipe_count;
		if (pipe_count > RCU_TORTURE_PIPE_LEN) {
			/* Should not happen, but... */
			pipe_count = RCU_TORTURE_PIPE_LEN;
		}
		++__get_cpu_var(rcu_torture_count)[pipe_count];
		completed = cur_ops->completed() - completed;
		if (completed > RCU_TORTURE_PIPE_LEN) {
			/* Should not happen, but... */
			completed = RCU_TORTURE_PIPE_LEN;
		}
		++__get_cpu_var(rcu_torture_batch)[completed];
		preempt_enable();
		cur_ops->readunlock(idx);
		schedule();
		rcu_stutter_wait("rcu_torture_reader");
	} while (!kthread_should_stop() && fullstop == FULLSTOP_DONTSTOP);
	VERBOSE_PRINTK_STRING("rcu_torture_reader task stopping");
	rcutorture_shutdown_absorb("rcu_torture_reader");
	if (irqreader && cur_ops->irqcapable)
		del_timer_sync(&t);
	while (!kthread_should_stop())
		schedule_timeout_uninterruptible(1);
	return 0;
}

/*
 * Create an RCU-torture statistics message in the specified buffer.
 */
static int
rcu_torture_printk(char *page)
{
	int cnt = 0;
	int cpu;
	int i;
	long pipesummary[RCU_TORTURE_PIPE_LEN + 1] = { 0 };
	long batchsummary[RCU_TORTURE_PIPE_LEN + 1] = { 0 };

	for_each_possible_cpu(cpu) {
		for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++) {
			pipesummary[i] += per_cpu(rcu_torture_count, cpu)[i];
			batchsummary[i] += per_cpu(rcu_torture_batch, cpu)[i];
		}
	}
	for (i = RCU_TORTURE_PIPE_LEN - 1; i >= 0; i--) {
		if (pipesummary[i] != 0)
			break;
	}
	cnt += sprintf(&page[cnt], "%s%s ", torture_type, TORTURE_FLAG);
	cnt += sprintf(&page[cnt],
		       "rtc: %p ver: %ld tfle: %d rta: %d rtaf: %d rtf: %d "
		       "rtmbe: %d nt: %ld",
		       rcu_torture_current,
		       rcu_torture_current_version,
		       list_empty(&rcu_torture_freelist),
		       atomic_read(&n_rcu_torture_alloc),
		       atomic_read(&n_rcu_torture_alloc_fail),
		       atomic_read(&n_rcu_torture_free),
		       atomic_read(&n_rcu_torture_mberror),
		       n_rcu_torture_timers);
	if (atomic_read(&n_rcu_torture_mberror) != 0)
		cnt += sprintf(&page[cnt], " !!!");
	cnt += sprintf(&page[cnt], "\n%s%s ", torture_type, TORTURE_FLAG);
	if (i > 1) {
		cnt += sprintf(&page[cnt], "!!! ");
		atomic_inc(&n_rcu_torture_error);
		WARN_ON_ONCE(1);
	}
	cnt += sprintf(&page[cnt], "Reader Pipe: ");
	for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++)
		cnt += sprintf(&page[cnt], " %ld", pipesummary[i]);
	cnt += sprintf(&page[cnt], "\n%s%s ", torture_type, TORTURE_FLAG);
	cnt += sprintf(&page[cnt], "Reader Batch: ");
	for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++)
		cnt += sprintf(&page[cnt], " %ld", batchsummary[i]);
	cnt += sprintf(&page[cnt], "\n%s%s ", torture_type, TORTURE_FLAG);
	cnt += sprintf(&page[cnt], "Free-Block Circulation: ");
	for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++) {
		cnt += sprintf(&page[cnt], " %d",
			       atomic_read(&rcu_torture_wcount[i]));
	}
	cnt += sprintf(&page[cnt], "\n");
	if (cur_ops->stats)
		cnt += cur_ops->stats(&page[cnt]);
	return cnt;
}

/*
 * Print torture statistics.  Caller must ensure that there is only
 * one call to this function at a given time!!!  This is normally
 * accomplished by relying on the module system to only have one copy
 * of the module loaded, and then by giving the rcu_torture_stats
 * kthread full control (or the init/cleanup functions when rcu_torture_stats
 * thread is not running).
 */
static void
rcu_torture_stats_print(void)
{
	int cnt;

	cnt = rcu_torture_printk(printk_buf);
	printk(KERN_ALERT "%s", printk_buf);
}

/*
 * Periodically prints torture statistics, if periodic statistics printing
 * was specified via the stat_interval module parameter.
 *
 * No need to worry about fullstop here, since this one doesn't reference
 * volatile state or register callbacks.
 */
static int
rcu_torture_stats(void *arg)
{
	VERBOSE_PRINTK_STRING("rcu_torture_stats task started");
	do {
		schedule_timeout_interruptible(stat_interval * HZ);
		rcu_torture_stats_print();
		rcutorture_shutdown_absorb("rcu_torture_stats");
	} while (!kthread_should_stop());
	VERBOSE_PRINTK_STRING("rcu_torture_stats task stopping");
	return 0;
}

static int rcu_idle_cpu;	/* Force all torture tasks off this CPU */

/* Shuffle tasks such that we allow @rcu_idle_cpu to become idle. A special case
 * is when @rcu_idle_cpu = -1, when we allow the tasks to run on all CPUs.
 */
static void rcu_torture_shuffle_tasks(void)
{
	cpumask_t tmp_mask;
	int i;

	cpus_setall(tmp_mask);
	get_online_cpus();

	/* No point in shuffling if there is only one online CPU (ex: UP) */
	if (num_online_cpus() == 1) {
		put_online_cpus();
		return;
	}

	if (rcu_idle_cpu != -1)
		cpu_clear(rcu_idle_cpu, tmp_mask);

	set_cpus_allowed_ptr(current, &tmp_mask);

	if (reader_tasks) {
		for (i = 0; i < nrealreaders; i++)
			if (reader_tasks[i])
				set_cpus_allowed_ptr(reader_tasks[i],
						     &tmp_mask);
	}

	if (fakewriter_tasks) {
		for (i = 0; i < nfakewriters; i++)
			if (fakewriter_tasks[i])
				set_cpus_allowed_ptr(fakewriter_tasks[i],
						     &tmp_mask);
	}

	if (writer_task)
		set_cpus_allowed_ptr(writer_task, &tmp_mask);

	if (stats_task)
		set_cpus_allowed_ptr(stats_task, &tmp_mask);

	if (rcu_idle_cpu == -1)
		rcu_idle_cpu = num_online_cpus() - 1;
	else
		rcu_idle_cpu--;

	put_online_cpus();
}

/* Shuffle tasks across CPUs, with the intent of allowing each CPU in the
 * system to become idle at a time and cut off its timer ticks. This is meant
 * to test the support for such tickless idle CPU in RCU.
 */
static int
rcu_torture_shuffle(void *arg)
{
	VERBOSE_PRINTK_STRING("rcu_torture_shuffle task started");
	do {
		schedule_timeout_interruptible(shuffle_interval * HZ);
		rcu_torture_shuffle_tasks();
		rcutorture_shutdown_absorb("rcu_torture_shuffle");
	} while (!kthread_should_stop());
	VERBOSE_PRINTK_STRING("rcu_torture_shuffle task stopping");
	return 0;
}

/* Cause the rcutorture test to "stutter", starting and stopping all
 * threads periodically.
 */
static int
rcu_torture_stutter(void *arg)
{
	VERBOSE_PRINTK_STRING("rcu_torture_stutter task started");
	do {
		schedule_timeout_interruptible(stutter * HZ);
		stutter_pause_test = 1;
		if (!kthread_should_stop())
			schedule_timeout_interruptible(stutter * HZ);
		stutter_pause_test = 0;
		rcutorture_shutdown_absorb("rcu_torture_stutter");
	} while (!kthread_should_stop());
	VERBOSE_PRINTK_STRING("rcu_torture_stutter task stopping");
	return 0;
}

static inline void
rcu_torture_print_module_parms(char *tag)
{
	printk(KERN_ALERT "%s" TORTURE_FLAG
		"--- %s: nreaders=%d nfakewriters=%d "
		"stat_interval=%d verbose=%d test_no_idle_hz=%d "
		"shuffle_interval=%d stutter=%d irqreader=%d\n",
		torture_type, tag, nrealreaders, nfakewriters,
		stat_interval, verbose, test_no_idle_hz, shuffle_interval,
		stutter, irqreader);
}

static struct notifier_block rcutorture_nb = {
	.notifier_call = rcutorture_shutdown_notify,
};

static void
rcu_torture_cleanup(void)
{
	int i;

	mutex_lock(&fullstop_mutex);
	if (fullstop == FULLSTOP_SHUTDOWN) {
		printk(KERN_WARNING /* but going down anyway, so... */
		       "Concurrent 'rmmod rcutorture' and shutdown illegal!\n");
		mutex_unlock(&fullstop_mutex);
		schedule_timeout_uninterruptible(10);
		if (cur_ops->cb_barrier != NULL)
			cur_ops->cb_barrier();
		return;
	}
	fullstop = FULLSTOP_RMMOD;
	mutex_unlock(&fullstop_mutex);
	unregister_reboot_notifier(&rcutorture_nb);
	if (stutter_task) {
		VERBOSE_PRINTK_STRING("Stopping rcu_torture_stutter task");
		kthread_stop(stutter_task);
	}
	stutter_task = NULL;
	if (shuffler_task) {
		VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task");
		kthread_stop(shuffler_task);
	}
	shuffler_task = NULL;

	if (writer_task) {
		VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task");
		kthread_stop(writer_task);
	}
	writer_task = NULL;

	if (reader_tasks) {
		for (i = 0; i < nrealreaders; i++) {
			if (reader_tasks[i]) {
				VERBOSE_PRINTK_STRING(
					"Stopping rcu_torture_reader task");
				kthread_stop(reader_tasks[i]);
			}
			reader_tasks[i] = NULL;
		}
		kfree(reader_tasks);
		reader_tasks = NULL;
	}
	rcu_torture_current = NULL;

	if (fakewriter_tasks) {
		for (i = 0; i < nfakewriters; i++) {
			if (fakewriter_tasks[i]) {
				VERBOSE_PRINTK_STRING(
					"Stopping rcu_torture_fakewriter task");
				kthread_stop(fakewriter_tasks[i]);
			}
			fakewriter_tasks[i] = NULL;
		}
		kfree(fakewriter_tasks);
		fakewriter_tasks = NULL;
	}

	if (stats_task) {
		VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task");
		kthread_stop(stats_task);
	}
	stats_task = NULL;

	/* Wait for all RCU callbacks to fire.  */

	if (cur_ops->cb_barrier != NULL)
		cur_ops->cb_barrier();

	rcu_torture_stats_print();  /* -After- the stats thread is stopped! */

	if (cur_ops->cleanup)
		cur_ops->cleanup();
	if (atomic_read(&n_rcu_torture_error))
		rcu_torture_print_module_parms("End of test: FAILURE");
	else
		rcu_torture_print_module_parms("End of test: SUCCESS");
}

static int __init
rcu_torture_init(void)
{
	int i;
	int cpu;
	int firsterr = 0;
	static struct rcu_torture_ops *torture_ops[] =
		{ &rcu_ops, &rcu_sync_ops, &rcu_bh_ops, &rcu_bh_sync_ops,
		  &srcu_ops, &sched_ops, &sched_ops_sync, };

	mutex_lock(&fullstop_mutex);

	/* Process args and tell the world that the torturer is on the job. */
	for (i = 0; i < ARRAY_SIZE(torture_ops); i++) {
		cur_ops = torture_ops[i];
		if (strcmp(torture_type, cur_ops->name) == 0)
			break;
	}
	if (i == ARRAY_SIZE(torture_ops)) {
		printk(KERN_ALERT "rcutorture: invalid torture type: \"%s\"\n",
		       torture_type);
		mutex_unlock(&fullstop_mutex);
		return (-EINVAL);
	}
	if (cur_ops->init)
		cur_ops->init(); /* no "goto unwind" prior to this point!!! */

	if (nreaders >= 0)
		nrealreaders = nreaders;
	else
		nrealreaders = 2 * num_online_cpus();
	rcu_torture_print_module_parms("Start of test");
	fullstop = FULLSTOP_DONTSTOP;

	/* Set up the freelist. */

	INIT_LIST_HEAD(&rcu_torture_freelist);
	for (i = 0; i < ARRAY_SIZE(rcu_tortures); i++) {
		rcu_tortures[i].rtort_mbtest = 0;
		list_add_tail(&rcu_tortures[i].rtort_free,
			      &rcu_torture_freelist);
	}

	/* Initialize the statistics so that each run gets its own numbers. */

	rcu_torture_current = NULL;
	rcu_torture_current_version = 0;
	atomic_set(&n_rcu_torture_alloc, 0);
	atomic_set(&n_rcu_torture_alloc_fail, 0);
	atomic_set(&n_rcu_torture_free, 0);
	atomic_set(&n_rcu_torture_mberror, 0);
	atomic_set(&n_rcu_torture_error, 0);
	for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++)
		atomic_set(&rcu_torture_wcount[i], 0);
	for_each_possible_cpu(cpu) {
		for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++) {
			per_cpu(rcu_torture_count, cpu)[i] = 0;
			per_cpu(rcu_torture_batch, cpu)[i] = 0;
		}
	}

	/* Start up the kthreads. */

	VERBOSE_PRINTK_STRING("Creating rcu_torture_writer task");
	writer_task = kthread_run(rcu_torture_writer, NULL,
				  "rcu_torture_writer");
	if (IS_ERR(writer_task)) {
		firsterr = PTR_ERR(writer_task);
		VERBOSE_PRINTK_ERRSTRING("Failed to create writer");
		writer_task = NULL;
		goto unwind;
	}
	fakewriter_tasks = kzalloc(nfakewriters * sizeof(fakewriter_tasks[0]),
	                           GFP_KERNEL);
	if (fakewriter_tasks == NULL) {
		VERBOSE_PRINTK_ERRSTRING("out of memory");
		firsterr = -ENOMEM;
		goto unwind;
	}
	for (i = 0; i < nfakewriters; i++) {
		VERBOSE_PRINTK_STRING("Creating rcu_torture_fakewriter task");
		fakewriter_tasks[i] = kthread_run(rcu_torture_fakewriter, NULL,
		                                  "rcu_torture_fakewriter");
		if (IS_ERR(fakewriter_tasks[i])) {
			firsterr = PTR_ERR(fakewriter_tasks[i]);
			VERBOSE_PRINTK_ERRSTRING("Failed to create fakewriter");
			fakewriter_tasks[i] = NULL;
			goto unwind;
		}
	}
	reader_tasks = kzalloc(nrealreaders * sizeof(reader_tasks[0]),
			       GFP_KERNEL);
	if (reader_tasks == NULL) {
		VERBOSE_PRINTK_ERRSTRING("out of memory");
		firsterr = -ENOMEM;
		goto unwind;
	}
	for (i = 0; i < nrealreaders; i++) {
		VERBOSE_PRINTK_STRING("Creating rcu_torture_reader task");
		reader_tasks[i] = kthread_run(rcu_torture_reader, NULL,
					      "rcu_torture_reader");
		if (IS_ERR(reader_tasks[i])) {
			firsterr = PTR_ERR(reader_tasks[i]);
			VERBOSE_PRINTK_ERRSTRING("Failed to create reader");
			reader_tasks[i] = NULL;
			goto unwind;
		}
	}
	if (stat_interval > 0) {
		VERBOSE_PRINTK_STRING("Creating rcu_torture_stats task");
		stats_task = kthread_run(rcu_torture_stats, NULL,
					"rcu_torture_stats");
		if (IS_ERR(stats_task)) {
			firsterr = PTR_ERR(stats_task);
			VERBOSE_PRINTK_ERRSTRING("Failed to create stats");
			stats_task = NULL;
			goto unwind;
		}
	}
	if (test_no_idle_hz) {
		rcu_idle_cpu = num_online_cpus() - 1;
		/* Create the shuffler thread */
		shuffler_task = kthread_run(rcu_torture_shuffle, NULL,
					  "rcu_torture_shuffle");
		if (IS_ERR(shuffler_task)) {
			firsterr = PTR_ERR(shuffler_task);
			VERBOSE_PRINTK_ERRSTRING("Failed to create shuffler");
			shuffler_task = NULL;
			goto unwind;
		}
	}
	if (stutter < 0)
		stutter = 0;
	if (stutter) {
		/* Create the stutter thread */
		stutter_task = kthread_run(rcu_torture_stutter, NULL,
					  "rcu_torture_stutter");
		if (IS_ERR(stutter_task)) {
			firsterr = PTR_ERR(stutter_task);
			VERBOSE_PRINTK_ERRSTRING("Failed to create stutter");
			stutter_task = NULL;
			goto unwind;
		}
	}
	register_reboot_notifier(&rcutorture_nb);
	mutex_unlock(&fullstop_mutex);
	return 0;

unwind:
	mutex_unlock(&fullstop_mutex);
	rcu_torture_cleanup();
	return firsterr;
}

module_init(rcu_torture_init);
module_exit(rcu_torture_cleanup);

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-07 19:46                                               ` Paul E. McKenney
@ 2009-01-07 20:19                                                 ` Eric Sesterhenn
  2009-01-07 22:06                                                   ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-07 20:19 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Tue, Jan 06, 2009 at 04:48:09AM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 06, 2009 at 08:47:07AM +0100, Eric Sesterhenn wrote:
> > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > On Mon, Jan 05, 2009 at 04:29:12PM -0800, Paul E. McKenney wrote:
> > > > > On Mon, Jan 05, 2009 at 02:18:36PM -0800, Paul E. McKenney wrote:
> > > > > I will see what I can find from this.
> > > > 
> > > > Hello, Eric,
> > > > 
> > > > Just for grins, I built the 2.6.28 kernel/rcutorture.c on b58602a.  It
> > > > insmoded and rmmoded without errors.
> > > > 
> > > > Could you please try this on your setup?  I have attached this file in
> > > > case that helps.
> > > 
> > > works like a charm
> > 
> > Well, that certainly tells me where the bug lies.  ;-)
> > 
> > Thank you for all your hard work on this, I should have a patch soon.
> 
> And here it is, against the original version of rcutorture.c (revert
> all of my previous patches against rcutorture.c).  I am also attaching
> the resulting rcutorture.c, if that makes things more convenient for you.
> 
> Could you please fire this up and see if it fixes the rmmod-time bug
> that you located?

it does fix the bug, thanks for your work

Greetings, Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-07 20:19                                                 ` Eric Sesterhenn
@ 2009-01-07 22:06                                                   ` Paul E. McKenney
  2009-01-07 22:34                                                     ` Eric Sesterhenn
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-07 22:06 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Wed, Jan 07, 2009 at 09:19:04PM +0100, Eric Sesterhenn wrote:
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Tue, Jan 06, 2009 at 04:48:09AM -0800, Paul E. McKenney wrote:
> > > On Tue, Jan 06, 2009 at 08:47:07AM +0100, Eric Sesterhenn wrote:
> > > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > > On Mon, Jan 05, 2009 at 04:29:12PM -0800, Paul E. McKenney wrote:
> > > > > > On Mon, Jan 05, 2009 at 02:18:36PM -0800, Paul E. McKenney wrote:
> > > > > > I will see what I can find from this.
> > > > > 
> > > > > Hello, Eric,
> > > > > 
> > > > > Just for grins, I built the 2.6.28 kernel/rcutorture.c on b58602a.  It
> > > > > insmoded and rmmoded without errors.
> > > > > 
> > > > > Could you please try this on your setup?  I have attached this file in
> > > > > case that helps.
> > > > 
> > > > works like a charm
> > > 
> > > Well, that certainly tells me where the bug lies.  ;-)
> > > 
> > > Thank you for all your hard work on this, I should have a patch soon.
> > 
> > And here it is, against the original version of rcutorture.c (revert
> > all of my previous patches against rcutorture.c).  I am also attaching
> > the resulting rcutorture.c, if that makes things more convenient for you.
> > 
> > Could you please fire this up and see if it fixes the rmmod-time bug
> > that you located?
> 
> it does fix the bug, thanks for your work

Thank you very much for finding this one, bearing with me during the
debug process, and testing it!  I will post the patch with credit to you.

Does this fix all of the problems that you observed, or is there still
some breakage?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-07 22:06                                                   ` Paul E. McKenney
@ 2009-01-07 22:34                                                     ` Eric Sesterhenn
  2009-01-07 22:48                                                       ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Sesterhenn @ 2009-01-07 22:34 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

* Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> On Wed, Jan 07, 2009 at 09:19:04PM +0100, Eric Sesterhenn wrote:
> > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > On Tue, Jan 06, 2009 at 04:48:09AM -0800, Paul E. McKenney wrote:
> > > > On Tue, Jan 06, 2009 at 08:47:07AM +0100, Eric Sesterhenn wrote:
> > > > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > > > On Mon, Jan 05, 2009 at 04:29:12PM -0800, Paul E. McKenney wrote:
> > > > > > > On Mon, Jan 05, 2009 at 02:18:36PM -0800, Paul E. McKenney wrote:
> > > > > > > I will see what I can find from this.
> > > > > > 
> > > > > > Hello, Eric,
> > > > > > 
> > > > > > Just for grins, I built the 2.6.28 kernel/rcutorture.c on b58602a.  It
> > > > > > insmoded and rmmoded without errors.
> > > > > > 
> > > > > > Could you please try this on your setup?  I have attached this file in
> > > > > > case that helps.
> > > > > 
> > > > > works like a charm
> > > > 
> > > > Well, that certainly tells me where the bug lies.  ;-)
> > > > 
> > > > Thank you for all your hard work on this, I should have a patch soon.
> > > 
> > > And here it is, against the original version of rcutorture.c (revert
> > > all of my previous patches against rcutorture.c).  I am also attaching
> > > the resulting rcutorture.c, if that makes things more convenient for you.
> > > 
> > > Could you please fire this up and see if it fixes the rmmod-time bug
> > > that you located?
> > 
> > it does fix the bug, thanks for your work
> 
> Thank you very much for finding this one, bearing with me during the
> debug process, and testing it!  I will post the patch with credit to you.
> 
> Does this fix all of the problems that you observed, or is there still
> some breakage?

It fixes everything i saw. Thanks.

Eric

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [BUG] NULL pointer deref with rcutorture
  2009-01-07 22:34                                                     ` Eric Sesterhenn
@ 2009-01-07 22:48                                                       ` Paul E. McKenney
  0 siblings, 0 replies; 27+ messages in thread
From: Paul E. McKenney @ 2009-01-07 22:48 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: Kamalesh Babulal, linux-kernel, josh, dipankar

On Wed, Jan 07, 2009 at 11:34:12PM +0100, Eric Sesterhenn wrote:
> * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > On Wed, Jan 07, 2009 at 09:19:04PM +0100, Eric Sesterhenn wrote:
> > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > On Tue, Jan 06, 2009 at 04:48:09AM -0800, Paul E. McKenney wrote:
> > > > > On Tue, Jan 06, 2009 at 08:47:07AM +0100, Eric Sesterhenn wrote:
> > > > > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote:
> > > > > > > On Mon, Jan 05, 2009 at 04:29:12PM -0800, Paul E. McKenney wrote:
> > > > > > > > On Mon, Jan 05, 2009 at 02:18:36PM -0800, Paul E. McKenney wrote:
> > > > > > > > I will see what I can find from this.
> > > > > > > 
> > > > > > > Hello, Eric,
> > > > > > > 
> > > > > > > Just for grins, I built the 2.6.28 kernel/rcutorture.c on b58602a.  It
> > > > > > > insmoded and rmmoded without errors.
> > > > > > > 
> > > > > > > Could you please try this on your setup?  I have attached this file in
> > > > > > > case that helps.
> > > > > > 
> > > > > > works like a charm
> > > > > 
> > > > > Well, that certainly tells me where the bug lies.  ;-)
> > > > > 
> > > > > Thank you for all your hard work on this, I should have a patch soon.
> > > > 
> > > > And here it is, against the original version of rcutorture.c (revert
> > > > all of my previous patches against rcutorture.c).  I am also attaching
> > > > the resulting rcutorture.c, if that makes things more convenient for you.
> > > > 
> > > > Could you please fire this up and see if it fixes the rmmod-time bug
> > > > that you located?
> > > 
> > > it does fix the bug, thanks for your work
> > 
> > Thank you very much for finding this one, bearing with me during the
> > debug process, and testing it!  I will post the patch with credit to you.
> > 
> > Does this fix all of the problems that you observed, or is there still
> > some breakage?
> 
> It fixes everything i saw. Thanks.

Woo-hoo!!!  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2009-01-07 22:49 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-02 11:18 [BUG] NULL pointer deref with rcutorture Eric Sesterhenn
2009-01-02 17:58 ` Paul E. McKenney
2009-01-02 18:53   ` Kamalesh Babulal
2009-01-02 19:53     ` Paul E. McKenney
2009-01-02 23:12       ` Eric Sesterhenn
2009-01-03  1:57         ` Paul E. McKenney
     [not found]           ` <20090103094003.GA6149@alice>
     [not found]             ` <20090104013254.GG6958@linux.vnet.ibm.com>
2009-01-04 14:57               ` Eric Sesterhenn
2009-01-04 21:13                 ` Paul E. McKenney
2009-01-04 23:38                   ` Eric Sesterhenn
2009-01-05  2:28                     ` Paul E. McKenney
2009-01-05 12:14                       ` Eric Sesterhenn
2009-01-05 18:00                         ` Paul E. McKenney
2009-01-05 18:56                           ` Eric Sesterhenn
2009-01-05 19:36                             ` Paul E. McKenney
2009-01-05 20:01                               ` Eric Sesterhenn
2009-01-05 20:16                                 ` Paul E. McKenney
2009-01-05 20:31                                   ` Eric Sesterhenn
2009-01-05 22:18                                     ` Paul E. McKenney
2009-01-06  0:29                                       ` Paul E. McKenney
2009-01-06  2:15                                         ` Paul E. McKenney
2009-01-06  7:47                                           ` Eric Sesterhenn
2009-01-06 12:48                                             ` Paul E. McKenney
2009-01-07 19:46                                               ` Paul E. McKenney
2009-01-07 20:19                                                 ` Eric Sesterhenn
2009-01-07 22:06                                                   ` Paul E. McKenney
2009-01-07 22:34                                                     ` Eric Sesterhenn
2009-01-07 22:48                                                       ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox