linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] exit: fix oops in sync_mm_rss
@ 2010-03-16 17:08 Michael S. Tsirkin
  2010-03-16 17:51 ` Andrea Arcangeli
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2010-03-16 17:08 UTC (permalink / raw)
  To: cl, lee.schermerhorn, rientjes
  Cc: Andrew Morton, Hugh Dickins, Rik van Riel, KAMEZAWA Hiroyuki,
	Minchan Kim, Andrea Arcangeli, David S. Miller, linux-mm,
	linux-kernel

In 2.6.34-rc1, removing vhost_net module causes an oops in sync_mm_rss
(called from do_exit) when workqueue is destroyed. This does not happen on
net-next, or with vhost on top of to 2.6.33.

The issue seems to be introduced by
34e55232e59f7b19050267a05ff1226e5cd122a5: that commit added function
sync_mm_rss that is passed task->mm, and dereferences it without
checking. If task is a kernel thread, mm might be NULL.
I think this might also happen e.g. with aio.

This patch fixes the oops by calling sync_mm_rss when task->mm
is set to NULL. I also added BUG_ON to detect any other cases
where counters get incremented while mm is NULL.

The oops I observed looks like this:

BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
IP: [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
PGD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
CPU 2
Modules linked in: vhost_net(-) tun bridge stp sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table kvm_intel kvm i5000_edac edac_core rtc_cmos bnx2 button i2c_i801 i2c_core rtc_core e1000e sg joydev ide_cd_mod serio_raw pcspkr rtc_lib cdrom virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]

Pid: 2046, comm: vhost Not tainted 2.6.34-rc1-vhost #25 System Planar/IBM System x3550 -[7978B3G]-
RIP: 0010:[<ffffffff810b436d>]  [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
RSP: 0018:ffff8802379b7e60  EFLAGS: 00010202
RAX: 0000000000000008 RBX: ffff88023f2390c0 RCX: 0000000000000000
RDX: ffff88023f2396b0 RSI: 0000000000000000 RDI: ffff88023f2390c0
RBP: ffff8802379b7e60 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88023aecfbc0 R11: 0000000000013240 R12: 0000000000000000
R13: ffffffff81051a6c R14: ffffe8ffffc0f540 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff880001e80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000002a8 CR3: 000000023af23000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process vhost (pid: 2046, threadinfo ffff8802379b6000, task ffff88023f2390c0)
Stack:
 ffff8802379b7ee0 ffffffff81040687 ffffe8ffffc0f558 ffffffffa00a3e2d
<0> 0000000000000000 ffff88023f2390c0 ffffffff81055817 ffff8802379b7e98
<0> ffff8802379b7e98 0000000100000286 ffff8802379b7ee0 ffff88023ad47d78
Call Trace:
 [<ffffffff81040687>] do_exit+0x147/0x6c4
 [<ffffffffa00a3e2d>] ? handle_rx_net+0x0/0x17 [vhost_net]
 [<ffffffff81055817>] ? autoremove_wake_function+0x0/0x39
 [<ffffffff81051a6c>] ? worker_thread+0x0/0x229
 [<ffffffff810553c9>] kthreadd+0x0/0xf2
 [<ffffffff810038d4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81055342>] ? kthread+0x0/0x87
 [<ffffffff810038d0>] ? kernel_thread_helper+0x0/0x10
Code: 00 8b 87 6c 02 00 00 85 c0 74 14 48 98 f0 48 01 86 a0 02 00 00 c7 87 6c 02 00 00 00 00 00 00 8b 87 70 02 00 00 85 c0 74 14 48 98 <f0> 48 01 86 a8 02 00 00 c7 87 70 02 00 00 00 00 00 00 8b 87 74
RIP  [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
 RSP <ffff8802379b7e60>
CR2: 00000000000002a8
---[ end trace 41603ba922beddd2 ]---
Fixing recursive fault but reboot is needed!

(note: handle_rx_net is a work item using workqueue in question).
sync_mm_rss+0x33/0x6f gave me a hint. I also tried reverting
34e55232e59f7b19050267a05ff1226e5cd122a5 and the oops goes away.

The module in question calls use_mm and later unuse_mm from a kernel
thread.  It is when this kernel thread is destroyed that the crash
happens.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 mm/memory.c      |    1 +
 mm/mmu_context.c |    1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index d1153e3..27022b3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -130,6 +130,7 @@ void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
 
 	for (i = 0; i < NR_MM_COUNTERS; i++) {
 		if (task->rss_stat.count[i]) {
+			BUG_ON(!mm);
 			add_mm_counter(mm, i, task->rss_stat.count[i]);
 			task->rss_stat.count[i] = 0;
 		}
diff --git a/mm/mmu_context.c b/mm/mmu_context.c
index 0777654..9e82e93 100644
--- a/mm/mmu_context.c
+++ b/mm/mmu_context.c
@@ -53,6 +53,7 @@ void unuse_mm(struct mm_struct *mm)
 	struct task_struct *tsk = current;
 
 	task_lock(tsk);
+	sync_mm_rss(tsk, mm);
 	tsk->mm = NULL;
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
-- 
1.7.0.18.g0d53a5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-16 17:08 [PATCH] exit: fix oops in sync_mm_rss Michael S. Tsirkin
@ 2010-03-16 17:51 ` Andrea Arcangeli
  2010-03-16 17:52 ` Rik van Riel
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Andrea Arcangeli @ 2010-03-16 17:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: cl, lee.schermerhorn, rientjes, Andrew Morton, Hugh Dickins,
	Rik van Riel, KAMEZAWA Hiroyuki, Minchan Kim, David S. Miller,
	linux-mm, linux-kernel

On Tue, Mar 16, 2010 at 07:08:08PM +0200, Michael S. Tsirkin wrote:
> The module in question calls use_mm and later unuse_mm from a kernel
> thread.  It is when this kernel thread is destroyed that the crash
> happens.

Looks good to me so the stats are transferred to mm before we lose
track of it.

Ack!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-16 17:08 [PATCH] exit: fix oops in sync_mm_rss Michael S. Tsirkin
  2010-03-16 17:51 ` Andrea Arcangeli
@ 2010-03-16 17:52 ` Rik van Riel
  2010-03-16 23:41 ` KAMEZAWA Hiroyuki
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2010-03-16 17:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: cl, lee.schermerhorn, rientjes, Andrew Morton, Hugh Dickins,
	KAMEZAWA Hiroyuki, Minchan Kim, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel

On 03/16/2010 01:08 PM, Michael S. Tsirkin wrote:

> (note: handle_rx_net is a work item using workqueue in question).
> sync_mm_rss+0x33/0x6f gave me a hint. I also tried reverting
> 34e55232e59f7b19050267a05ff1226e5cd122a5 and the oops goes away.
>
> The module in question calls use_mm and later unuse_mm from a kernel
> thread.  It is when this kernel thread is destroyed that the crash
> happens.
>
> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>

Reviewed-by: Rik van Riel <riel@redhat.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-16 17:08 [PATCH] exit: fix oops in sync_mm_rss Michael S. Tsirkin
  2010-03-16 17:51 ` Andrea Arcangeli
  2010-03-16 17:52 ` Rik van Riel
@ 2010-03-16 23:41 ` KAMEZAWA Hiroyuki
  2010-03-17  2:26 ` Minchan Kim
  2010-03-30 20:56 ` Andrew Morton
  4 siblings, 0 replies; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-03-16 23:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: cl, lee.schermerhorn, rientjes, Andrew Morton, Hugh Dickins,
	Rik van Riel, Minchan Kim, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel

On Tue, 16 Mar 2010 19:08:08 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> In 2.6.34-rc1, removing vhost_net module causes an oops in sync_mm_rss
> (called from do_exit) when workqueue is destroyed. This does not happen on
> net-next, or with vhost on top of to 2.6.33.
> 
> The issue seems to be introduced by
> 34e55232e59f7b19050267a05ff1226e5cd122a5: that commit added function
> sync_mm_rss that is passed task->mm, and dereferences it without
> checking. If task is a kernel thread, mm might be NULL.
> I think this might also happen e.g. with aio.
> 
> This patch fixes the oops by calling sync_mm_rss when task->mm
> is set to NULL. I also added BUG_ON to detect any other cases
> where counters get incremented while mm is NULL.
> 
> The oops I observed looks like this:
> 
> BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
> IP: [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
> PGD 0
> Oops: 0002 [#1] SMP
> last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
> CPU 2
> Modules linked in: vhost_net(-) tun bridge stp sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table kvm_intel kvm i5000_edac edac_core rtc_cmos bnx2 button i2c_i801 i2c_core rtc_core e1000e sg joydev ide_cd_mod serio_raw pcspkr rtc_lib cdrom virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
> 
> Pid: 2046, comm: vhost Not tainted 2.6.34-rc1-vhost #25 System Planar/IBM System x3550 -[7978B3G]-
> RIP: 0010:[<ffffffff810b436d>]  [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
> RSP: 0018:ffff8802379b7e60  EFLAGS: 00010202
> RAX: 0000000000000008 RBX: ffff88023f2390c0 RCX: 0000000000000000
> RDX: ffff88023f2396b0 RSI: 0000000000000000 RDI: ffff88023f2390c0
> RBP: ffff8802379b7e60 R08: 0000000000000000 R09: 0000000000000000
> R10: ffff88023aecfbc0 R11: 0000000000013240 R12: 0000000000000000
> R13: ffffffff81051a6c R14: ffffe8ffffc0f540 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff880001e80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000000002a8 CR3: 000000023af23000 CR4: 00000000000406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process vhost (pid: 2046, threadinfo ffff8802379b6000, task ffff88023f2390c0)
> Stack:
>  ffff8802379b7ee0 ffffffff81040687 ffffe8ffffc0f558 ffffffffa00a3e2d
> <0> 0000000000000000 ffff88023f2390c0 ffffffff81055817 ffff8802379b7e98
> <0> ffff8802379b7e98 0000000100000286 ffff8802379b7ee0 ffff88023ad47d78
> Call Trace:
>  [<ffffffff81040687>] do_exit+0x147/0x6c4
>  [<ffffffffa00a3e2d>] ? handle_rx_net+0x0/0x17 [vhost_net]
>  [<ffffffff81055817>] ? autoremove_wake_function+0x0/0x39
>  [<ffffffff81051a6c>] ? worker_thread+0x0/0x229
>  [<ffffffff810553c9>] kthreadd+0x0/0xf2
>  [<ffffffff810038d4>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81055342>] ? kthread+0x0/0x87
>  [<ffffffff810038d0>] ? kernel_thread_helper+0x0/0x10
> Code: 00 8b 87 6c 02 00 00 85 c0 74 14 48 98 f0 48 01 86 a0 02 00 00 c7 87 6c 02 00 00 00 00 00 00 8b 87 70 02 00 00 85 c0 74 14 48 98 <f0> 48 01 86 a8 02 00 00 c7 87 70 02 00 00 00 00 00 00 8b 87 74
> RIP  [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
>  RSP <ffff8802379b7e60>
> CR2: 00000000000002a8
> ---[ end trace 41603ba922beddd2 ]---
> Fixing recursive fault but reboot is needed!
> 
> (note: handle_rx_net is a work item using workqueue in question).
> sync_mm_rss+0x33/0x6f gave me a hint. I also tried reverting
> 34e55232e59f7b19050267a05ff1226e5cd122a5 and the oops goes away.
> 
> The module in question calls use_mm and later unuse_mm from a kernel
> thread.  It is when this kernel thread is destroyed that the crash
> happens.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Thank you very much.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


> ---
>  mm/memory.c      |    1 +
>  mm/mmu_context.c |    1 +
>  2 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index d1153e3..27022b3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -130,6 +130,7 @@ void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
>  
>  	for (i = 0; i < NR_MM_COUNTERS; i++) {
>  		if (task->rss_stat.count[i]) {
> +			BUG_ON(!mm);
>  			add_mm_counter(mm, i, task->rss_stat.count[i]);
>  			task->rss_stat.count[i] = 0;
>  		}
> diff --git a/mm/mmu_context.c b/mm/mmu_context.c
> index 0777654..9e82e93 100644
> --- a/mm/mmu_context.c
> +++ b/mm/mmu_context.c
> @@ -53,6 +53,7 @@ void unuse_mm(struct mm_struct *mm)
>  	struct task_struct *tsk = current;
>  
>  	task_lock(tsk);
> +	sync_mm_rss(tsk, mm);
>  	tsk->mm = NULL;
>  	/* active_mm is still 'mm' */
>  	enter_lazy_tlb(mm, tsk);
> -- 
> 1.7.0.18.g0d53a5
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-16 17:08 [PATCH] exit: fix oops in sync_mm_rss Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2010-03-16 23:41 ` KAMEZAWA Hiroyuki
@ 2010-03-17  2:26 ` Minchan Kim
  2010-03-30 20:56 ` Andrew Morton
  4 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2010-03-17  2:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: cl, lee.schermerhorn, rientjes, Andrew Morton, Hugh Dickins,
	Rik van Riel, KAMEZAWA Hiroyuki, Andrea Arcangeli,
	David S. Miller, linux-mm, linux-kernel

On Wed, Mar 17, 2010 at 2:08 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> In 2.6.34-rc1, removing vhost_net module causes an oops in sync_mm_rss
> (called from do_exit) when workqueue is destroyed. This does not happen on
> net-next, or with vhost on top of to 2.6.33.
>
> The issue seems to be introduced by
> 34e55232e59f7b19050267a05ff1226e5cd122a5: that commit added function
> sync_mm_rss that is passed task->mm, and dereferences it without
> checking. If task is a kernel thread, mm might be NULL.
> I think this might also happen e.g. with aio.
>
> This patch fixes the oops by calling sync_mm_rss when task->mm
> is set to NULL. I also added BUG_ON to detect any other cases
> where counters get incremented while mm is NULL.
>
> The oops I observed looks like this:
>
> BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
> IP: [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
> PGD 0
> Oops: 0002 [#1] SMP
> last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
> CPU 2
> Modules linked in: vhost_net(-) tun bridge stp sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table kvm_intel kvm i5000_edac edac_core rtc_cmos bnx2 button i2c_i801 i2c_core rtc_core e1000e sg joydev ide_cd_mod serio_raw pcspkr rtc_lib cdrom virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
>
> Pid: 2046, comm: vhost Not tainted 2.6.34-rc1-vhost #25 System Planar/IBM System x3550 -[7978B3G]-
> RIP: 0010:[<ffffffff810b436d>]  [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
> RSP: 0018:ffff8802379b7e60  EFLAGS: 00010202
> RAX: 0000000000000008 RBX: ffff88023f2390c0 RCX: 0000000000000000
> RDX: ffff88023f2396b0 RSI: 0000000000000000 RDI: ffff88023f2390c0
> RBP: ffff8802379b7e60 R08: 0000000000000000 R09: 0000000000000000
> R10: ffff88023aecfbc0 R11: 0000000000013240 R12: 0000000000000000
> R13: ffffffff81051a6c R14: ffffe8ffffc0f540 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff880001e80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000000002a8 CR3: 000000023af23000 CR4: 00000000000406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process vhost (pid: 2046, threadinfo ffff8802379b6000, task ffff88023f2390c0)
> Stack:
>  ffff8802379b7ee0 ffffffff81040687 ffffe8ffffc0f558 ffffffffa00a3e2d
> <0> 0000000000000000 ffff88023f2390c0 ffffffff81055817 ffff8802379b7e98
> <0> ffff8802379b7e98 0000000100000286 ffff8802379b7ee0 ffff88023ad47d78
> Call Trace:
>  [<ffffffff81040687>] do_exit+0x147/0x6c4
>  [<ffffffffa00a3e2d>] ? handle_rx_net+0x0/0x17 [vhost_net]
>  [<ffffffff81055817>] ? autoremove_wake_function+0x0/0x39
>  [<ffffffff81051a6c>] ? worker_thread+0x0/0x229
>  [<ffffffff810553c9>] kthreadd+0x0/0xf2
>  [<ffffffff810038d4>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81055342>] ? kthread+0x0/0x87
>  [<ffffffff810038d0>] ? kernel_thread_helper+0x0/0x10
> Code: 00 8b 87 6c 02 00 00 85 c0 74 14 48 98 f0 48 01 86 a0 02 00 00 c7 87 6c 02 00 00 00 00 00 00 8b 87 70 02 00 00 85 c0 74 14 48 98 <f0> 48 01 86 a8 02 00 00 c7 87 70 02 00 00 00 00 00 00 8b 87 74
> RIP  [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
>  RSP <ffff8802379b7e60>
> CR2: 00000000000002a8
> ---[ end trace 41603ba922beddd2 ]---
> Fixing recursive fault but reboot is needed!
>
> (note: handle_rx_net is a work item using workqueue in question).
> sync_mm_rss+0x33/0x6f gave me a hint. I also tried reverting
> 34e55232e59f7b19050267a05ff1226e5cd122a5 and the oops goes away.
>
> The module in question calls use_mm and later unuse_mm from a kernel
> thread.  It is when this kernel thread is destroyed that the crash
> happens.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>

Nice catch.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-16 17:08 [PATCH] exit: fix oops in sync_mm_rss Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  2010-03-17  2:26 ` Minchan Kim
@ 2010-03-30 20:56 ` Andrew Morton
  2010-03-31  0:28   ` KAMEZAWA Hiroyuki
  4 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-03-30 20:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: cl, lee.schermerhorn, rientjes, Hugh Dickins, Rik van Riel,
	KAMEZAWA Hiroyuki, Minchan Kim, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Tue, 16 Mar 2010 19:08:08 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> In 2.6.34-rc1, removing vhost_net module causes an oops in sync_mm_rss
> (called from do_exit) when workqueue is destroyed. This does not happen on
> net-next, or with vhost on top of to 2.6.33.
> 
> The issue seems to be introduced by
> 34e55232e59f7b19050267a05ff1226e5cd122a5: that commit added function
> sync_mm_rss that is passed task->mm, and dereferences it without
> checking. If task is a kernel thread, mm might be NULL.
> I think this might also happen e.g. with aio.
> 
> This patch fixes the oops by calling sync_mm_rss when task->mm
> is set to NULL. I also added BUG_ON to detect any other cases
> where counters get incremented while mm is NULL.
> 
> The oops I observed looks like this:
> 
> BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
> IP: [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
> PGD 0
> Oops: 0002 [#1] SMP
> last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
> CPU 2
> Modules linked in: vhost_net(-) tun bridge stp sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table kvm_intel kvm i5000_edac edac_core rtc_cmos bnx2 button i2c_i801 i2c_core rtc_core e1000e sg joydev ide_cd_mod serio_raw pcspkr rtc_lib cdrom virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
> 
> Pid: 2046, comm: vhost Not tainted 2.6.34-rc1-vhost #25 System Planar/IBM System x3550 -[7978B3G]-
> RIP: 0010:[<ffffffff810b436d>]  [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
> RSP: 0018:ffff8802379b7e60  EFLAGS: 00010202
> RAX: 0000000000000008 RBX: ffff88023f2390c0 RCX: 0000000000000000
> RDX: ffff88023f2396b0 RSI: 0000000000000000 RDI: ffff88023f2390c0
> RBP: ffff8802379b7e60 R08: 0000000000000000 R09: 0000000000000000
> R10: ffff88023aecfbc0 R11: 0000000000013240 R12: 0000000000000000
> R13: ffffffff81051a6c R14: ffffe8ffffc0f540 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff880001e80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000000002a8 CR3: 000000023af23000 CR4: 00000000000406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process vhost (pid: 2046, threadinfo ffff8802379b6000, task ffff88023f2390c0)
> Stack:
>  ffff8802379b7ee0 ffffffff81040687 ffffe8ffffc0f558 ffffffffa00a3e2d
> <0> 0000000000000000 ffff88023f2390c0 ffffffff81055817 ffff8802379b7e98
> <0> ffff8802379b7e98 0000000100000286 ffff8802379b7ee0 ffff88023ad47d78
> Call Trace:
>  [<ffffffff81040687>] do_exit+0x147/0x6c4
>  [<ffffffffa00a3e2d>] ? handle_rx_net+0x0/0x17 [vhost_net]
>  [<ffffffff81055817>] ? autoremove_wake_function+0x0/0x39
>  [<ffffffff81051a6c>] ? worker_thread+0x0/0x229
>  [<ffffffff810553c9>] kthreadd+0x0/0xf2
>  [<ffffffff810038d4>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81055342>] ? kthread+0x0/0x87
>  [<ffffffff810038d0>] ? kernel_thread_helper+0x0/0x10
> Code: 00 8b 87 6c 02 00 00 85 c0 74 14 48 98 f0 48 01 86 a0 02 00 00 c7 87 6c 02 00 00 00 00 00 00 8b 87 70 02 00 00 85 c0 74 14 48 98 <f0> 48 01 86 a8 02 00 00 c7 87 70 02 00 00 00 00 00 00 8b 87 74
> RIP  [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
>  RSP <ffff8802379b7e60>
> CR2: 00000000000002a8
> ---[ end trace 41603ba922beddd2 ]---
> Fixing recursive fault but reboot is needed!
> 
> (note: handle_rx_net is a work item using workqueue in question).
> sync_mm_rss+0x33/0x6f gave me a hint. I also tried reverting
> 34e55232e59f7b19050267a05ff1226e5cd122a5 and the oops goes away.
> 
> The module in question calls use_mm and later unuse_mm from a kernel
> thread.  It is when this kernel thread is destroyed that the crash
> happens.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  mm/memory.c      |    1 +
>  mm/mmu_context.c |    1 +
>  2 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index d1153e3..27022b3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -130,6 +130,7 @@ void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
>  
>  	for (i = 0; i < NR_MM_COUNTERS; i++) {
>  		if (task->rss_stat.count[i]) {
> +			BUG_ON(!mm);
>  			add_mm_counter(mm, i, task->rss_stat.count[i]);
>  			task->rss_stat.count[i] = 0;
>  		}
> diff --git a/mm/mmu_context.c b/mm/mmu_context.c
> index 0777654..9e82e93 100644
> --- a/mm/mmu_context.c
> +++ b/mm/mmu_context.c
> @@ -53,6 +53,7 @@ void unuse_mm(struct mm_struct *mm)
>  	struct task_struct *tsk = current;
>  
>  	task_lock(tsk);
> +	sync_mm_rss(tsk, mm);
>  	tsk->mm = NULL;
>  	/* active_mm is still 'mm' */
>  	enter_lazy_tlb(mm, tsk);

That new BUG_ON() is triggering in Troels's machine when a bluetooth
keyboard is enabled or disabled.  See
(https://bugzilla.kernel.org/show_bug.cgi?id=15648.

I guess the question is: how did a kernel thread get a non-zero
task->rss_stat.count[i]?  If that's expected and OK then we will need
to take some kernel-thread-avoidance action there.

Could whoever fixes this please also make __sync_task_rss_stat()
static.

I'll toss this over to Rafael/Maciej for tracking as a post-2.6.33
regression.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-31  0:28   ` KAMEZAWA Hiroyuki
@ 2010-03-30 21:37     ` Andrew Morton
  2010-03-31  0:41       ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-03-30 21:37 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michael S. Tsirkin, cl, lee.schermerhorn, rientjes, Hugh Dickins,
	Rik van Riel, Minchan Kim, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Wed, 31 Mar 2010 09:28:15 +0900 KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Tue, 30 Mar 2010 13:56:34 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > That new BUG_ON() is triggering in Troels's machine when a bluetooth
> > keyboard is enabled or disabled.  See
> > (https://bugzilla.kernel.org/show_bug.cgi?id=15648.
> > 
> > I guess the question is: how did a kernel thread get a non-zero
> > task->rss_stat.count[i]?  If that's expected and OK then we will need
> > to take some kernel-thread-avoidance action there.
> > 
> It seems my fault that it's not initialized to be 0 at do_fork(), copy_process.
> 
> About do_exit, do_exit() does this check. So, tsk->mm can be NULL.
> 
>  949         if (group_dead) {
>  950                 hrtimer_cancel(&tsk->signal->real_timer);
>  951                 exit_itimers(tsk->signal);
>  952                 if (tsk->mm)
>  953                         setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
>  954         }
> 
> > Could whoever fixes this please also make __sync_task_rss_stat()
> > static.
> > 
> Ah, yes. I should do so.
> 
> > I'll toss this over to Rafael/Maciej for tracking as a post-2.6.33
> > regression.
> > 
> > Thanks.
> > 
> 
> 
> ==
> 
> task->rss_stat wasn't initialized to 0 at copy_process().
> at exit, tsk->mm may be NULL.
> And __sync_task_rss_stat() should be static.
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
>  kernel/exit.c |    3 ++-
>  kernel/fork.c |    3 +++
>  mm/memory.c   |    2 +-
>  3 files changed, 6 insertions(+), 2 deletions(-)
> 
> Index: mmotm-2.6.34-Mar24/kernel/exit.c
> ===================================================================
> --- mmotm-2.6.34-Mar24.orig/kernel/exit.c
> +++ mmotm-2.6.34-Mar24/kernel/exit.c
> @@ -950,7 +950,8 @@ NORET_TYPE void do_exit(long code)
>  
>  	acct_update_integrals(tsk);
>  	/* sync mm's RSS info before statistics gathering */
> -	sync_mm_rss(tsk, tsk->mm);
> +	if (tsk->mm)
> +		sync_mm_rss(tsk, tsk->mm);
>  	group_dead = atomic_dec_and_test(&tsk->signal->live);
>  	if (group_dead) {
>  		hrtimer_cancel(&tsk->signal->real_timer);
> Index: mmotm-2.6.34-Mar24/mm/memory.c
> ===================================================================
> --- mmotm-2.6.34-Mar24.orig/mm/memory.c
> +++ mmotm-2.6.34-Mar24/mm/memory.c
> @@ -124,7 +124,7 @@ core_initcall(init_zero_pfn);
>  
>  #if defined(SPLIT_RSS_COUNTING)
>  
> -void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
> +static void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
>  {
>  	int i;
>  
> Index: mmotm-2.6.34-Mar24/kernel/fork.c
> ===================================================================
> --- mmotm-2.6.34-Mar24.orig/kernel/fork.c
> +++ mmotm-2.6.34-Mar24/kernel/fork.c
> @@ -1060,6 +1060,9 @@ static struct task_struct *copy_process(
>  	p->prev_utime = cputime_zero;
>  	p->prev_stime = cputime_zero;
>  #endif
> +#if defined(SPLIT_RSS_COUNTING)
> +	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
> +#endif
>  
>  	p->default_timer_slack_ns = current->timer_slack_ns;

OK, so the kenrel thread inherited a non-zero rss_stat from a userspace
parent?

With this fixed, the test for non-zero tsk->mm is't really needed in
do_exit(), is it?  I guess it makes sense though - sync_mm_rss() only
really works for kernel threads by luck..

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-31  0:41       ` KAMEZAWA Hiroyuki
@ 2010-03-30 22:22         ` Andrew Morton
  2010-03-31  1:27           ` KAMEZAWA Hiroyuki
  2010-03-31  1:57         ` Minchan Kim
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-03-30 22:22 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michael S. Tsirkin, cl, lee.schermerhorn, rientjes, Hugh Dickins,
	Rik van Riel, Minchan Kim, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Wed, 31 Mar 2010 09:41:24 +0900 KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> > With this fixed, the test for non-zero tsk->mm is't really needed in
> > do_exit(), is it?  I guess it makes sense though - sync_mm_rss() only
> > really works for kernel threads by luck..
> 
> At first, I considered so, too. But I changed my mind to show
> "we know tsk->mm can be NULL here!" by code. 
> Because __sync_mm_rss_stat() has BUG_ON(!mm), the code reader will think
> tsk->mm shouldn't be NULL always.
> 
> Doesn't make sense ?

uh, not really ;)


I think we should do this too:

--- a/mm/memory.c~exit-fix-oops-in-sync_mm_rss-fix
+++ a/mm/memory.c
@@ -131,7 +131,6 @@ static void __sync_task_rss_stat(struct 
 
 	for (i = 0; i < NR_MM_COUNTERS; i++) {
 		if (task->rss_stat.count[i]) {
-			BUG_ON(!mm);
 			add_mm_counter(mm, i, task->rss_stat.count[i]);
 			task->rss_stat.count[i] = 0;
 		}
_

Because we just made sure it can't happen, and if it _does_ happen, the
oops will tell us the samme thing that the BUG_ON() would have.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-31  2:53             ` Minchan Kim
@ 2010-03-31  0:03               ` Andrew Morton
  2010-03-31  3:11                 ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2010-03-31  0:03 UTC (permalink / raw)
  To: Minchan Kim
  Cc: KAMEZAWA Hiroyuki, Michael S. Tsirkin, cl, lee.schermerhorn,
	rientjes, Hugh Dickins, Rik van Riel, Andrea Arcangeli,
	David S. Miller, linux-mm, linux-kernel, Troels Liebe Bentsen,
	linux-bluetooth

On Wed, 31 Mar 2010 11:53:00 +0900 Minchan Kim <minchan.kim@gmail.com> wrote:

> >> I think we should do this too:
> >>
> >> --- a/mm/memory.c~exit-fix-oops-in-sync_mm_rss-fix
> >> +++ a/mm/memory.c
> >> @@ -131,7 +131,6 @@ static void __sync_task_rss_stat(struct
> >>
> >> __ __ __ for (i = 0; i < NR_MM_COUNTERS; i++) {
> >> __ __ __ __ __ __ __ if (task->rss_stat.count[i]) {
> >> - __ __ __ __ __ __ __ __ __ __ BUG_ON(!mm);
> >> __ __ __ __ __ __ __ __ __ __ __ add_mm_counter(mm, i, task->rss_stat.count[i]);
> >> __ __ __ __ __ __ __ __ __ __ __ task->rss_stat.count[i] = 0;
> >> __ __ __ __ __ __ __ }

^^ gargh, gmail.

> >>
> >> Because we just made sure it can't happen, and if it _does_ happen, the
> >> oops will tell us the samme thing that the BUG_ON() would have.
> >>
> >
> > Hmm, then, finaly..
> > ==
> >
> > task->rss_stat wasn't initialized to 0 at copy_process().
> > And __sync_task_rss_stat() should be static.
> > removed BUG_ON(!mm) in __sync_task_rss_stat() for avoiding to show
> > wrong information to code readers. Anyway, if !mm && task->rss_stat
> > has some value, panic will happen.
> >
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Reviewed-by: Minchan Kim <minchan.kim@gmail.com>

I think we should keep the

--- a/kernel/exit.c~exit-fix-oops-in-sync_mm_rss
+++ a/kernel/exit.c
@@ -953,7 +953,8 @@ NORET_TYPE void do_exit(long code)
 
 	acct_update_integrals(tsk);
 	/* sync mm's RSS info before statistics gathering */
-	sync_mm_rss(tsk, tsk->mm);
+	if (tsk->mm)
+		sync_mm_rss(tsk, tsk->mm);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
 		hrtimer_cancel(&tsk->signal->real_timer);

really.  Apart from the fact that we'll otherwise perform an empty
NR_MM_COUNTERS loop in __sync_task_rss_stat(), sync_mm_rss() just isn't
set up to handle kernel threads.  Given that the function of
sync_task_mm(from, to) is to move stuff from "from" and into "to", it's
daft to call it with a NULL value of `to'!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-30 20:56 ` Andrew Morton
@ 2010-03-31  0:28   ` KAMEZAWA Hiroyuki
  2010-03-30 21:37     ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-03-31  0:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michael S. Tsirkin, cl, lee.schermerhorn, rientjes, Hugh Dickins,
	Rik van Riel, Minchan Kim, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Tue, 30 Mar 2010 13:56:34 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> That new BUG_ON() is triggering in Troels's machine when a bluetooth
> keyboard is enabled or disabled.  See
> (https://bugzilla.kernel.org/show_bug.cgi?id=15648.
> 
> I guess the question is: how did a kernel thread get a non-zero
> task->rss_stat.count[i]?  If that's expected and OK then we will need
> to take some kernel-thread-avoidance action there.
> 
It seems my fault that it's not initialized to be 0 at do_fork(), copy_process.

About do_exit, do_exit() does this check. So, tsk->mm can be NULL.

 949         if (group_dead) {
 950                 hrtimer_cancel(&tsk->signal->real_timer);
 951                 exit_itimers(tsk->signal);
 952                 if (tsk->mm)
 953                         setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
 954         }

> Could whoever fixes this please also make __sync_task_rss_stat()
> static.
> 
Ah, yes. I should do so.

> I'll toss this over to Rafael/Maciej for tracking as a post-2.6.33
> regression.
> 
> Thanks.
> 


==

task->rss_stat wasn't initialized to 0 at copy_process().
at exit, tsk->mm may be NULL.
And __sync_task_rss_stat() should be static.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 kernel/exit.c |    3 ++-
 kernel/fork.c |    3 +++
 mm/memory.c   |    2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

Index: mmotm-2.6.34-Mar24/kernel/exit.c
===================================================================
--- mmotm-2.6.34-Mar24.orig/kernel/exit.c
+++ mmotm-2.6.34-Mar24/kernel/exit.c
@@ -950,7 +950,8 @@ NORET_TYPE void do_exit(long code)
 
 	acct_update_integrals(tsk);
 	/* sync mm's RSS info before statistics gathering */
-	sync_mm_rss(tsk, tsk->mm);
+	if (tsk->mm)
+		sync_mm_rss(tsk, tsk->mm);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
 		hrtimer_cancel(&tsk->signal->real_timer);
Index: mmotm-2.6.34-Mar24/mm/memory.c
===================================================================
--- mmotm-2.6.34-Mar24.orig/mm/memory.c
+++ mmotm-2.6.34-Mar24/mm/memory.c
@@ -124,7 +124,7 @@ core_initcall(init_zero_pfn);
 
 #if defined(SPLIT_RSS_COUNTING)
 
-void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
+static void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
 {
 	int i;
 
Index: mmotm-2.6.34-Mar24/kernel/fork.c
===================================================================
--- mmotm-2.6.34-Mar24.orig/kernel/fork.c
+++ mmotm-2.6.34-Mar24/kernel/fork.c
@@ -1060,6 +1060,9 @@ static struct task_struct *copy_process(
 	p->prev_utime = cputime_zero;
 	p->prev_stime = cputime_zero;
 #endif
+#if defined(SPLIT_RSS_COUNTING)
+	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
+#endif
 
 	p->default_timer_slack_ns = current->timer_slack_ns;
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-30 21:37     ` Andrew Morton
@ 2010-03-31  0:41       ` KAMEZAWA Hiroyuki
  2010-03-30 22:22         ` Andrew Morton
  2010-03-31  1:57         ` Minchan Kim
  0 siblings, 2 replies; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-03-31  0:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michael S. Tsirkin, cl, lee.schermerhorn, rientjes, Hugh Dickins,
	Rik van Riel, Minchan Kim, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Tue, 30 Mar 2010 17:37:21 -0400
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed, 31 Mar 2010 09:28:15 +0900 KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> 
> > On Tue, 30 Mar 2010 13:56:34 -0700
> > Andrew Morton <akpm@linux-foundation.org> wrote:
> > 
> > > That new BUG_ON() is triggering in Troels's machine when a bluetooth
> > > keyboard is enabled or disabled.  See
> > > (https://bugzilla.kernel.org/show_bug.cgi?id=15648.
> > > 
> > > I guess the question is: how did a kernel thread get a non-zero
> > > task->rss_stat.count[i]?  If that's expected and OK then we will need
> > > to take some kernel-thread-avoidance action there.
> > > 
> > It seems my fault that it's not initialized to be 0 at do_fork(), copy_process.
> > 
> > About do_exit, do_exit() does this check. So, tsk->mm can be NULL.
> > 
> >  949         if (group_dead) {
> >  950                 hrtimer_cancel(&tsk->signal->real_timer);
> >  951                 exit_itimers(tsk->signal);
> >  952                 if (tsk->mm)
> >  953                         setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
> >  954         }
> > 
> > > Could whoever fixes this please also make __sync_task_rss_stat()
> > > static.
> > > 
> > Ah, yes. I should do so.
> > 
> > > I'll toss this over to Rafael/Maciej for tracking as a post-2.6.33
> > > regression.
> > > 
> > > Thanks.
> > > 
> > 
> > 
> > ==
> > 
> > task->rss_stat wasn't initialized to 0 at copy_process().
> > at exit, tsk->mm may be NULL.
> > And __sync_task_rss_stat() should be static.
> > 
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > ---
> >  kernel/exit.c |    3 ++-
> >  kernel/fork.c |    3 +++
> >  mm/memory.c   |    2 +-
> >  3 files changed, 6 insertions(+), 2 deletions(-)
> > 
> > Index: mmotm-2.6.34-Mar24/kernel/exit.c
> > ===================================================================
> > --- mmotm-2.6.34-Mar24.orig/kernel/exit.c
> > +++ mmotm-2.6.34-Mar24/kernel/exit.c
> > @@ -950,7 +950,8 @@ NORET_TYPE void do_exit(long code)
> >  
> >  	acct_update_integrals(tsk);
> >  	/* sync mm's RSS info before statistics gathering */
> > -	sync_mm_rss(tsk, tsk->mm);
> > +	if (tsk->mm)
> > +		sync_mm_rss(tsk, tsk->mm);
> >  	group_dead = atomic_dec_and_test(&tsk->signal->live);
> >  	if (group_dead) {
> >  		hrtimer_cancel(&tsk->signal->real_timer);
> > Index: mmotm-2.6.34-Mar24/mm/memory.c
> > ===================================================================
> > --- mmotm-2.6.34-Mar24.orig/mm/memory.c
> > +++ mmotm-2.6.34-Mar24/mm/memory.c
> > @@ -124,7 +124,7 @@ core_initcall(init_zero_pfn);
> >  
> >  #if defined(SPLIT_RSS_COUNTING)
> >  
> > -void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
> > +static void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
> >  {
> >  	int i;
> >  
> > Index: mmotm-2.6.34-Mar24/kernel/fork.c
> > ===================================================================
> > --- mmotm-2.6.34-Mar24.orig/kernel/fork.c
> > +++ mmotm-2.6.34-Mar24/kernel/fork.c
> > @@ -1060,6 +1060,9 @@ static struct task_struct *copy_process(
> >  	p->prev_utime = cputime_zero;
> >  	p->prev_stime = cputime_zero;
> >  #endif
> > +#if defined(SPLIT_RSS_COUNTING)
> > +	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
> > +#endif
> >  
> >  	p->default_timer_slack_ns = current->timer_slack_ns;
> 
> OK, so the kenrel thread inherited a non-zero rss_stat from a userspace
> parent?
> 
I think so.

> With this fixed, the test for non-zero tsk->mm is't really needed in
> do_exit(), is it?  I guess it makes sense though - sync_mm_rss() only
> really works for kernel threads by luck..

At first, I considered so, too. But I changed my mind to show
"we know tsk->mm can be NULL here!" by code. 
Because __sync_mm_rss_stat() has BUG_ON(!mm), the code reader will think
tsk->mm shouldn't be NULL always.

Doesn't make sense ?

Thanks,
-Kame



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-30 22:22         ` Andrew Morton
@ 2010-03-31  1:27           ` KAMEZAWA Hiroyuki
  2010-03-31  2:53             ` Minchan Kim
  0 siblings, 1 reply; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-03-31  1:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michael S. Tsirkin, cl, lee.schermerhorn, rientjes, Hugh Dickins,
	Rik van Riel, Minchan Kim, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Tue, 30 Mar 2010 18:22:58 -0400
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed, 31 Mar 2010 09:41:24 +0900 KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> 
> > > With this fixed, the test for non-zero tsk->mm is't really needed in
> > > do_exit(), is it?  I guess it makes sense though - sync_mm_rss() only
> > > really works for kernel threads by luck..
> > 
> > At first, I considered so, too. But I changed my mind to show
> > "we know tsk->mm can be NULL here!" by code. 
> > Because __sync_mm_rss_stat() has BUG_ON(!mm), the code reader will think
> > tsk->mm shouldn't be NULL always.
> > 
> > Doesn't make sense ?
> 
> uh, not really ;)
> 
> 
> I think we should do this too:
> 
> --- a/mm/memory.c~exit-fix-oops-in-sync_mm_rss-fix
> +++ a/mm/memory.c
> @@ -131,7 +131,6 @@ static void __sync_task_rss_stat(struct 
>  
>  	for (i = 0; i < NR_MM_COUNTERS; i++) {
>  		if (task->rss_stat.count[i]) {
> -			BUG_ON(!mm);
>  			add_mm_counter(mm, i, task->rss_stat.count[i]);
>  			task->rss_stat.count[i] = 0;
>  		}
> _
> 
> Because we just made sure it can't happen, and if it _does_ happen, the
> oops will tell us the samme thing that the BUG_ON() would have.
> 

Hmm, then, finaly..
==

task->rss_stat wasn't initialized to 0 at copy_process().
And __sync_task_rss_stat() should be static.
removed BUG_ON(!mm) in __sync_task_rss_stat() for avoiding to show
wrong information to code readers. Anyway, if !mm && task->rss_stat
has some value, panic will happen.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 kernel/fork.c |    3 +++
 mm/memory.c   |    3 +--
 2 files changed, 4 insertions(+), 2 deletions(-)

Index: mmotm-2.6.34-Mar24/mm/memory.c
===================================================================
--- mmotm-2.6.34-Mar24.orig/mm/memory.c
+++ mmotm-2.6.34-Mar24/mm/memory.c
@@ -124,13 +124,12 @@ core_initcall(init_zero_pfn);
 
 #if defined(SPLIT_RSS_COUNTING)
 
-void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
+static void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
 {
 	int i;
 
 	for (i = 0; i < NR_MM_COUNTERS; i++) {
 		if (task->rss_stat.count[i]) {
-			BUG_ON(!mm);
 			add_mm_counter(mm, i, task->rss_stat.count[i]);
 			task->rss_stat.count[i] = 0;
 		}
Index: mmotm-2.6.34-Mar24/kernel/fork.c
===================================================================
--- mmotm-2.6.34-Mar24.orig/kernel/fork.c
+++ mmotm-2.6.34-Mar24/kernel/fork.c
@@ -1060,6 +1060,9 @@ static struct task_struct *copy_process(
 	p->prev_utime = cputime_zero;
 	p->prev_stime = cputime_zero;
 #endif
+#if defined(SPLIT_RSS_COUNTING)
+	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
+#endif
 
 	p->default_timer_slack_ns = current->timer_slack_ns;
 







--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-31  0:41       ` KAMEZAWA Hiroyuki
  2010-03-30 22:22         ` Andrew Morton
@ 2010-03-31  1:57         ` Minchan Kim
  2010-03-31  2:12           ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 17+ messages in thread
From: Minchan Kim @ 2010-03-31  1:57 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Andrew Morton, Michael S. Tsirkin, cl, lee.schermerhorn, rientjes,
	Hugh Dickins, Rik van Riel, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Wed, Mar 31, 2010 at 9:41 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Tue, 30 Mar 2010 17:37:21 -0400
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
>> On Wed, 31 Mar 2010 09:28:15 +0900 KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>
>> > On Tue, 30 Mar 2010 13:56:34 -0700
>> > Andrew Morton <akpm@linux-foundation.org> wrote:
>> >
>> > > That new BUG_ON() is triggering in Troels's machine when a bluetooth
>> > > keyboard is enabled or disabled.  See
>> > > (https://bugzilla.kernel.org/show_bug.cgi?id=15648.
>> > >
>> > > I guess the question is: how did a kernel thread get a non-zero
>> > > task->rss_stat.count[i]?  If that's expected and OK then we will need
>> > > to take some kernel-thread-avoidance action there.
>> > >
>> > It seems my fault that it's not initialized to be 0 at do_fork(), copy_process.
>> >
>> > About do_exit, do_exit() does this check. So, tsk->mm can be NULL.
>> >
>> >  949         if (group_dead) {
>> >  950                 hrtimer_cancel(&tsk->signal->real_timer);
>> >  951                 exit_itimers(tsk->signal);
>> >  952                 if (tsk->mm)
>> >  953                         setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
>> >  954         }
>> >
>> > > Could whoever fixes this please also make __sync_task_rss_stat()
>> > > static.
>> > >
>> > Ah, yes. I should do so.
>> >
>> > > I'll toss this over to Rafael/Maciej for tracking as a post-2.6.33
>> > > regression.
>> > >
>> > > Thanks.
>> > >
>> >
>> >
>> > ==
>> >
>> > task->rss_stat wasn't initialized to 0 at copy_process().
>> > at exit, tsk->mm may be NULL.
>> > And __sync_task_rss_stat() should be static.
>> >
>> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> > ---
>> >  kernel/exit.c |    3 ++-
>> >  kernel/fork.c |    3 +++
>> >  mm/memory.c   |    2 +-
>> >  3 files changed, 6 insertions(+), 2 deletions(-)
>> >
>> > Index: mmotm-2.6.34-Mar24/kernel/exit.c
>> > ===================================================================
>> > --- mmotm-2.6.34-Mar24.orig/kernel/exit.c
>> > +++ mmotm-2.6.34-Mar24/kernel/exit.c
>> > @@ -950,7 +950,8 @@ NORET_TYPE void do_exit(long code)
>> >
>> >     acct_update_integrals(tsk);
>> >     /* sync mm's RSS info before statistics gathering */
>> > -   sync_mm_rss(tsk, tsk->mm);
>> > +   if (tsk->mm)
>> > +           sync_mm_rss(tsk, tsk->mm);
>> >     group_dead = atomic_dec_and_test(&tsk->signal->live);
>> >     if (group_dead) {
>> >             hrtimer_cancel(&tsk->signal->real_timer);
>> > Index: mmotm-2.6.34-Mar24/mm/memory.c
>> > ===================================================================
>> > --- mmotm-2.6.34-Mar24.orig/mm/memory.c
>> > +++ mmotm-2.6.34-Mar24/mm/memory.c
>> > @@ -124,7 +124,7 @@ core_initcall(init_zero_pfn);
>> >
>> >  #if defined(SPLIT_RSS_COUNTING)
>> >
>> > -void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
>> > +static void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
>> >  {
>> >     int i;
>> >
>> > Index: mmotm-2.6.34-Mar24/kernel/fork.c
>> > ===================================================================
>> > --- mmotm-2.6.34-Mar24.orig/kernel/fork.c
>> > +++ mmotm-2.6.34-Mar24/kernel/fork.c
>> > @@ -1060,6 +1060,9 @@ static struct task_struct *copy_process(
>> >     p->prev_utime = cputime_zero;
>> >     p->prev_stime = cputime_zero;
>> >  #endif
>> > +#if defined(SPLIT_RSS_COUNTING)
>> > +   memset(&p->rss_stat, 0, sizeof(p->rss_stat));
>> > +#endif
>> >
>> >     p->default_timer_slack_ns = current->timer_slack_ns;
>>
>> OK, so the kenrel thread inherited a non-zero rss_stat from a userspace
>> parent?
>>
> I think so.
>
>> With this fixed, the test for non-zero tsk->mm is't really needed in
>> do_exit(), is it?  I guess it makes sense though - sync_mm_rss() only
>> really works for kernel threads by luck..
>
> At first, I considered so, too. But I changed my mind to show
> "we know tsk->mm can be NULL here!" by code.
> Because __sync_mm_rss_stat() has BUG_ON(!mm), the code reader will think
> tsk->mm shouldn't be NULL always.
>
> Doesn't make sense ?
>

Nitpick.
How about moving sync_mm_rss into after check !mm of exit_mm?



-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-31  1:57         ` Minchan Kim
@ 2010-03-31  2:12           ` KAMEZAWA Hiroyuki
  2010-03-31  2:48             ` Minchan Kim
  0 siblings, 1 reply; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-03-31  2:12 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Michael S. Tsirkin, cl, lee.schermerhorn, rientjes,
	Hugh Dickins, Rik van Riel, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Wed, 31 Mar 2010 10:57:18 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:

> On Wed, Mar 31, 2010 at 9:41 AM, KAMEZAWA Hiroyuki

> > Doesn't make sense ?
> >
> 
> Nitpick.
> How about moving sync_mm_rss into after check !mm of exit_mm?
> 
Hmm.
==
        sync_mm_rss(tsk, tsk->mm);
        group_dead = atomic_dec_and_test(&tsk->signal->live);
        if (group_dead) {
                hrtimer_cancel(&tsk->signal->real_timer);
                exit_itimers(tsk->signal);
                if (tsk->mm)
                        setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm); ---(**)
        }
        acct_collect(code, group_dead);
        if (group_dead)
                tty_audit_exit();
        if (unlikely(tsk->audit_context))
                audit_free(tsk);

        tsk->exit_code = code;
        taskstats_exit(tsk, group_dead); --------(*)
	
        exit_mm(tsk);
==
task_acct routine has to handle mm information (*).
So, we have to sync somewhere before exit_mm() or tsk->mm goes to NULL.

I think taskstat is an only acct hook which gatheres mm's rss information
but I placed sync_mm_rss() before all accounting routine.
Anyway, sync_mm_rss() should be before (**)
setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);

Thanks,
-Kame 




--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-31  2:12           ` KAMEZAWA Hiroyuki
@ 2010-03-31  2:48             ` Minchan Kim
  0 siblings, 0 replies; 17+ messages in thread
From: Minchan Kim @ 2010-03-31  2:48 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Andrew Morton, Michael S. Tsirkin, cl, lee.schermerhorn, rientjes,
	Hugh Dickins, Rik van Riel, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Wed, Mar 31, 2010 at 11:12 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Wed, 31 Mar 2010 10:57:18 +0900
> Minchan Kim <minchan.kim@gmail.com> wrote:
>
>> On Wed, Mar 31, 2010 at 9:41 AM, KAMEZAWA Hiroyuki
>
>> > Doesn't make sense ?
>> >
>>
>> Nitpick.
>> How about moving sync_mm_rss into after check !mm of exit_mm?
>>
> Hmm.
> ==
>        sync_mm_rss(tsk, tsk->mm);
>        group_dead = atomic_dec_and_test(&tsk->signal->live);
>        if (group_dead) {
>                hrtimer_cancel(&tsk->signal->real_timer);
>                exit_itimers(tsk->signal);
>                if (tsk->mm)
>                        setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm); ---(**)
>        }
>        acct_collect(code, group_dead);
>        if (group_dead)
>                tty_audit_exit();
>        if (unlikely(tsk->audit_context))
>                audit_free(tsk);
>
>        tsk->exit_code = code;
>        taskstats_exit(tsk, group_dead); --------(*)
>
>        exit_mm(tsk);
> ==
> task_acct routine has to handle mm information (*).

Indeed. I missed that.
Thanks, Kame.


-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-31  1:27           ` KAMEZAWA Hiroyuki
@ 2010-03-31  2:53             ` Minchan Kim
  2010-03-31  0:03               ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Minchan Kim @ 2010-03-31  2:53 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Andrew Morton, Michael S. Tsirkin, cl, lee.schermerhorn, rientjes,
	Hugh Dickins, Rik van Riel, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Wed, Mar 31, 2010 at 10:27 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Tue, 30 Mar 2010 18:22:58 -0400
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
>> On Wed, 31 Mar 2010 09:41:24 +0900 KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>
>> > > With this fixed, the test for non-zero tsk->mm is't really needed in
>> > > do_exit(), is it?  I guess it makes sense though - sync_mm_rss() only
>> > > really works for kernel threads by luck..
>> >
>> > At first, I considered so, too. But I changed my mind to show
>> > "we know tsk->mm can be NULL here!" by code.
>> > Because __sync_mm_rss_stat() has BUG_ON(!mm), the code reader will think
>> > tsk->mm shouldn't be NULL always.
>> >
>> > Doesn't make sense ?
>>
>> uh, not really ;)
>>
>>
>> I think we should do this too:
>>
>> --- a/mm/memory.c~exit-fix-oops-in-sync_mm_rss-fix
>> +++ a/mm/memory.c
>> @@ -131,7 +131,6 @@ static void __sync_task_rss_stat(struct
>>
>>       for (i = 0; i < NR_MM_COUNTERS; i++) {
>>               if (task->rss_stat.count[i]) {
>> -                     BUG_ON(!mm);
>>                       add_mm_counter(mm, i, task->rss_stat.count[i]);
>>                       task->rss_stat.count[i] = 0;
>>               }
>> _
>>
>> Because we just made sure it can't happen, and if it _does_ happen, the
>> oops will tell us the samme thing that the BUG_ON() would have.
>>
>
> Hmm, then, finaly..
> ==
>
> task->rss_stat wasn't initialized to 0 at copy_process().
> And __sync_task_rss_stat() should be static.
> removed BUG_ON(!mm) in __sync_task_rss_stat() for avoiding to show
> wrong information to code readers. Anyway, if !mm && task->rss_stat
> has some value, panic will happen.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>


-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] exit: fix oops in sync_mm_rss
  2010-03-31  0:03               ` Andrew Morton
@ 2010-03-31  3:11                 ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-03-31  3:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, Michael S. Tsirkin, cl, lee.schermerhorn, rientjes,
	Hugh Dickins, Rik van Riel, Andrea Arcangeli, David S. Miller,
	linux-mm, linux-kernel, Troels Liebe Bentsen, linux-bluetooth

On Tue, 30 Mar 2010 20:03:36 -0400
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed, 31 Mar 2010 11:53:00 +0900 Minchan Kim <minchan.kim@gmail.com> wrote:
 
> really.  Apart from the fact that we'll otherwise perform an empty
> NR_MM_COUNTERS loop in __sync_task_rss_stat(), sync_mm_rss() just isn't
> set up to handle kernel threads.  Given that the function of
> sync_task_mm(from, to) is to move stuff from "from" and into "to", it's
> daft to call it with a NULL value of `to'!
> 
Updated again.

==

task->rss_stat wasn't initialized to 0 at copy_process().
At exit, tsk->mm may be NULL. It's not valid to call sync_mm_rss()
against not exisiting mm_struct, We should check it.
And __sync_task_rss_stat() should be static.
This patch also removes BUG_ON(!mm) in __sync_task_rss_stat().
The code will panic if !mm without it.

Changelog:
 - removed BUG_ON()
 - added check task->mm in exit.c before calling sync_mm_rss().

Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 kernel/exit.c |    3 ++-
 kernel/fork.c |    3 +++
 mm/memory.c   |    3 +--
 3 files changed, 6 insertions(+), 3 deletions(-)

Index: mmotm-2.6.34-Mar24/mm/memory.c
===================================================================
--- mmotm-2.6.34-Mar24.orig/mm/memory.c
+++ mmotm-2.6.34-Mar24/mm/memory.c
@@ -124,13 +124,12 @@ core_initcall(init_zero_pfn);
 
 #if defined(SPLIT_RSS_COUNTING)
 
-void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
+static void __sync_task_rss_stat(struct task_struct *task, struct mm_struct *mm)
 {
 	int i;
 
 	for (i = 0; i < NR_MM_COUNTERS; i++) {
 		if (task->rss_stat.count[i]) {
-			BUG_ON(!mm);
 			add_mm_counter(mm, i, task->rss_stat.count[i]);
 			task->rss_stat.count[i] = 0;
 		}
Index: mmotm-2.6.34-Mar24/kernel/fork.c
===================================================================
--- mmotm-2.6.34-Mar24.orig/kernel/fork.c
+++ mmotm-2.6.34-Mar24/kernel/fork.c
@@ -1060,6 +1060,9 @@ static struct task_struct *copy_process(
 	p->prev_utime = cputime_zero;
 	p->prev_stime = cputime_zero;
 #endif
+#if defined(SPLIT_RSS_COUNTING)
+	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
+#endif
 
 	p->default_timer_slack_ns = current->timer_slack_ns;
 
Index: mmotm-2.6.34-Mar24/kernel/exit.c
===================================================================
--- mmotm-2.6.34-Mar24.orig/kernel/exit.c
+++ mmotm-2.6.34-Mar24/kernel/exit.c
@@ -950,7 +950,8 @@ NORET_TYPE void do_exit(long code)
 
 	acct_update_integrals(tsk);
 	/* sync mm's RSS info before statistics gathering */
-	sync_mm_rss(tsk, tsk->mm);
+	if (tsk->mm)
+		sync_mm_rss(tsk, tsk->mm);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
 		hrtimer_cancel(&tsk->signal->real_timer);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2010-03-31  3:15 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-16 17:08 [PATCH] exit: fix oops in sync_mm_rss Michael S. Tsirkin
2010-03-16 17:51 ` Andrea Arcangeli
2010-03-16 17:52 ` Rik van Riel
2010-03-16 23:41 ` KAMEZAWA Hiroyuki
2010-03-17  2:26 ` Minchan Kim
2010-03-30 20:56 ` Andrew Morton
2010-03-31  0:28   ` KAMEZAWA Hiroyuki
2010-03-30 21:37     ` Andrew Morton
2010-03-31  0:41       ` KAMEZAWA Hiroyuki
2010-03-30 22:22         ` Andrew Morton
2010-03-31  1:27           ` KAMEZAWA Hiroyuki
2010-03-31  2:53             ` Minchan Kim
2010-03-31  0:03               ` Andrew Morton
2010-03-31  3:11                 ` KAMEZAWA Hiroyuki
2010-03-31  1:57         ` Minchan Kim
2010-03-31  2:12           ` KAMEZAWA Hiroyuki
2010-03-31  2:48             ` Minchan Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).