From: Zlatko Calusic <zlatko.calusic@iskon.hr>
To: Zhouping Liu <zliu@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	Johannes Weiner <jweiner@redhat.com>,
	mgorman@suse.de, hughd@google.com,
	Andrea Arcangeli <aarcange@redhat.com>,
	Hillf Danton <dhillf@gmail.com>
Subject: Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000500
Date: Thu, 27 Dec 2012 15:58:51 +0100
Message-ID: <50DC622B.7000802@iskon.hr>
In-Reply-To: <692539675.35132464.1356520940797.JavaMail.root@redhat.com>

On 26.12.2012 12:22, Zhouping Liu wrote:
> Hello everyone,
> 
> The latest mainline (637704cbc95c) triggers the following error when the system is under
> some pressure (in my testing, I used the oom01 case from the LTP test suite to trigger the issue):
> 
> [ 5462.920151] BUG: unable to handle kernel NULL pointer dereference at 0000000000000500
> [ 5462.927991] IP: [<ffffffff811542d9>] wait_iff_congested+0x59/0x140
> [ 5462.934176] PGD 0
> [ 5462.936191] Oops: 0000 [#2] SMP
> [ 5462.939428] Modules linked in: lockd sunrpc iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tabled
> [ 5462.984261] CPU 13
> [ 5462.986184] Pid: 117, comm: kswapd3 Tainted: G      D      3.8.0-rc1+ #1 Dell Inc. PowerEdge M905/0D413F
> [ 5462.995814] RIP: 0010:[<ffffffff811542d9>]  [<ffffffff811542d9>] wait_iff_congested+0x59/0x140
> [ 5463.004411] RSP: 0018:ffff88007c97fd48  EFLAGS: 00010202
> [ 5463.009701] RAX: 0000000000000001 RBX: 0000000000000064 RCX: 0000000000000001
> [ 5463.016818] RDX: 0000000000000064 RSI: 0000000000000000 RDI: 0000000000000000
> [ 5463.023926] RBP: ffff88007c97fd98 R08: 0000000000000000 R09: ffff88022ffd9d80
> [ 5463.031033] R10: 0000000000003189 R11: 0000000000000000 R12: 00000001004ee87e
> [ 5463.038140] R13: 0000000000000002 R14: 0000000000000000 R15: ffff88022ffd9000
> [ 5463.045258] FS:  00007f3e570de740(0000) GS:ffff88022fcc0000(0000) knlGS:0000000000000000
> [ 5463.053317] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 5463.059041] CR2: 0000000000000500 CR3: 00000000018dc000 CR4: 00000000000007e0
> [ 5463.066157] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 5463.073276] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 5463.080400] Process kswapd3 (pid: 117, threadinfo ffff88007c97e000, task ffff88007c981970)
> [ 5463.088633] Stack:
> [ 5463.090646]  ffff88007c97fd98 0000000000000000 ffff88007c981970 ffffffff81086080
> [ 5463.098090]  ffff88007c97fd68 ffff88007c97fd68 ffff88022ffd9d80 0000000000000002
> [ 5463.105527]  0000000000000002 0000000000000000 ffff88007c97feb8 ffffffff8114b0e3
> [ 5463.112998] Call Trace:
> [ 5463.115446]  [<ffffffff81086080>] ? wake_up_bit+0x40/0x40
> [ 5463.120826]  [<ffffffff8114b0e3>] kswapd+0x6c3/0xa50
> [ 5463.125775]  [<ffffffff8114aa20>] ? zone_reclaim+0x270/0x270
> [ 5463.131415]  [<ffffffff81085680>] kthread+0xc0/0xd0
> [ 5463.136278]  [<ffffffff810855c0>] ? kthread_create_on_node+0x120/0x120
> [ 5463.142786]  [<ffffffff8160a0ac>] ret_from_fork+0x7c/0xb0
> [ 5463.148166]  [<ffffffff810855c0>] ? kthread_create_on_node+0x120/0x120
> [ 5463.154668] Code: 4e 6d 88 00 48 c7 45 b8 00 00 00 00 48 83 c0 18 48 c7 45 c8 80 60 08 81 48 89 45 d0 48 89 45 d8 8b 04 b5 a0 9a cd 81 85 c0 74 0f <48> 8b 87 00 05 00 00 a8 04 0f 85 98 00 00 00 e8 b3 c3
> [ 5463.174097] RIP  [<ffffffff811542d9>] wait_iff_congested+0x59/0x140
> [ 5463.180352]  RSP <ffff88007c97fd48>
> [ 5463.183824] CR2: 0000000000000500
> [ 5463.203717] ---[ end trace 9ff4ff9087c13a36 ]---
> 
> I attached the config file; I hope it helps.
> 
> Thanks,
> Zhouping
> 

Thank you for the report, Zhouping!

Would you be so kind as to test the following patch and report the results? Please apply it to the latest mainline.

Thanks,

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 23291b9..e55ce55 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2564,6 +2564,7 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, long remaining,
 static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
 							int *classzone_idx)
 {
+	bool pgdat_is_balanced = false;
 	struct zone *unbalanced_zone;
 	int i;
 	int end_zone = 0;	/* Inclusive.  0 = ZONE_DMA */
@@ -2638,8 +2639,11 @@ loop_again:
 				zone_clear_flag(zone, ZONE_CONGESTED);
 			}
 		}
-		if (i < 0)
+
+		if (i < 0) {
+			pgdat_is_balanced = true;
 			goto out;
+		}
 
 		for (i = 0; i <= end_zone; i++) {
 			struct zone *zone = pgdat->node_zones + i;
@@ -2766,8 +2770,11 @@ loop_again:
 				pfmemalloc_watermark_ok(pgdat))
 			wake_up(&pgdat->pfmemalloc_wait);
 
-		if (pgdat_balanced(pgdat, order, *classzone_idx))
+		if (pgdat_balanced(pgdat, order, *classzone_idx)) {
+			pgdat_is_balanced = true;
 			break;		/* kswapd: all done */
+		}
+
 		/*
 		 * OK, kswapd is getting into trouble.  Take a nap, then take
 		 * another pass across the zones.
@@ -2775,7 +2782,7 @@ loop_again:
 		if (total_scanned && (sc.priority < DEF_PRIORITY - 2)) {
 			if (has_under_min_watermark_zone)
 				count_vm_event(KSWAPD_SKIP_CONGESTION_WAIT);
-			else
+			else if (unbalanced_zone)
 				wait_iff_congested(unbalanced_zone, BLK_RW_ASYNC, HZ/10);
 		}
 
@@ -2788,9 +2795,9 @@ loop_again:
 		if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX)
 			break;
 	} while (--sc.priority >= 0);
-out:
 
-	if (!pgdat_balanced(pgdat, order, *classzone_idx)) {
+out:
+	if (!pgdat_is_balanced) {
 		cond_resched();
 
 		try_to_freeze();
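
For readers who want the idea without the surrounding kernel code, here is a minimal user-space sketch of the two changes in the patch. It is an illustration under stated assumptions, not the kernel implementation: every toy_* name is hypothetical, the struct holds only a flags word, and the assert() stands in for the fault. The oops above shows RDI = 0 with a fault address of 0x500, which is consistent with a NULL zone pointer reaching wait_iff_congested().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zone; only a flags word is modelled. */
struct toy_zone {
	unsigned long flags;
};

/* Counts how often the toy wait helper was entered. */
static int toy_wait_calls;

/*
 * Hypothetical stand-in for wait_iff_congested(). It reads zone->flags,
 * so a NULL zone is fatal. The assert models the NULL dereference.
 */
static void toy_wait_iff_congested(struct toy_zone *zone)
{
	unsigned long flags;

	assert(zone != NULL);
	flags = zone->flags;
	(void)flags;
	toy_wait_calls++;
}

/*
 * One simplified pass of the balancing loop. The return value plays the
 * role of pgdat_is_balanced: it is set where the balance check succeeds,
 * so the exit path does not have to repeat the check.
 */
static bool toy_balance_step(bool node_balanced,
			     bool has_under_min_watermark_zone,
			     struct toy_zone *unbalanced_zone)
{
	bool pgdat_is_balanced = false;

	if (node_balanced) {
		pgdat_is_balanced = true;
		return pgdat_is_balanced;
	}

	if (has_under_min_watermark_zone) {
		/* The kernel counts KSWAPD_SKIP_CONGESTION_WAIT here. */
	} else if (unbalanced_zone) {
		/* Only wait when there is a zone to wait on. */
		toy_wait_iff_congested(unbalanced_zone);
	}

	return pgdat_is_balanced;
}
```

Calling toy_balance_step(false, false, NULL) returns false without entering the wait helper, which is the case the added "else if (unbalanced_zone)" guard is meant to cover.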

-- 
Zlatko

