From: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
To: Hugh Dickins <hughd@google.com>
Cc: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>,
LKML <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Greg KH <gregkh@linuxfoundation.org>, Tejun Heo <tj@kernel.org>
Subject: Re: linux-3.7.1: OOPS in page_lock_anon_vma
Date: Fri, 11 Jan 2013 12:54:42 +0100
Message-ID: <50EFFD82.4030408@fold.natur.cuni.cz>
In-Reply-To: <alpine.LNX.2.00.1301061616110.6198@eggly.anvils>
[-- Attachment #1: Type: text/plain, Size: 7460 bytes --]
Hugh Dickins wrote:
> On Sun, 6 Jan 2013, Martin Mokrejs wrote:
>
>> I was running the 3.7.1 kernel quite fine for a while, but I realized that it is slow and that
>> I should drop useless kernel drivers from my kernel. I have a SandyBridge-based
>> laptop and I found that I gain speed by setting CONFIG_NO_HZ=y and CONFIG_PREEMPT_NONE=y,
>> removing the multicore scheduler, and asking the configurator to set the maximum number of CPUs
>> for my system (and not blindly specifying 4 for my dual-core i7 processor).
>> Further, I get a faster system by removing IOMMU and DMA redirects while it still
>> emulates NUMA. And I switched from the CFQ scheduler to deadline and from SLAB to SLUB.
>> Finally, I made sure my CPU cores do not go back and forth between C0 and C7 states, and I
>> dynamically shut down the 2 hyperthreaded cores. So I really have only two physical cores
>> accessible. With the performance CPU governor I have half the context switches, and both cores
>> can be saturated by whatever jobs (kernel compile or some computational jobs). It was not
>> possible to keep the CPU running at turbo speed for long, as it always dropped down
>> from time to time. With the ondemand governor I had cores in C7 for 50-70% of the time; that was
>> a bit better with the performance governor, but having the two hyperthreaded cores disabled
>> reduced the context switches by half, and rescheduling interrupts went down by several orders
>> of magnitude. So it is crunching at max turbo speed on both cores, at about 80 °C.
>>
>> I think none of the changes relates to the kernel crash directly, but I had not had a single crash
>> with 3.7.1 for a few weeks. After the tweaks I had 3-4 crashes this afternoon. The system always
>> locked up so I could not see anything. Luckily, be it actually the same crash or not, this time my
>> X11 screen was dropped to my framebuffer console and I got to see a kernel stacktrace. Here
>> is the first, fished out from /var/log/messages upon next bootup:
>>
>>
>> Jan 6 22:37:29 vostro kernel: [ 7663.251110] general protection fault: 0000 [#1] SMP
>> Jan 6 22:37:29 vostro kernel: [ 7663.251135] Modules linked in: i915 fbcon bitblit cfbfillrect softcursor cfbimgblt i2c_algo_bit font cfbcopyarea drm_kms_helper drm fb iwldvm iwlwifi fbdev sata_sil24
>> Jan 6 22:37:29 vostro kernel: [ 7663.251197] CPU 1
>> Jan 6 22:37:29 vostro kernel: [ 7663.251206] Pid: 795, comm: kswapd0 Not tainted 3.7.1-default #22 Dell Inc. Vostro 3550/
>> Jan 6 22:37:29 vostro kernel: [ 7663.251229] RIP: 0010:[<ffffffff815d3dee>] [<ffffffff815d3dee>] mutex_trylock+0xb/0x26
>> Jan 6 22:37:29 vostro kernel: [ 7663.251257] RSP: 0018:ffff88040d25bbb8 EFLAGS: 00010246
>> Jan 6 22:37:29 vostro kernel: [ 7663.251273] RAX: 0000000000000001 RBX: ffff88040bfdc000 RCX: ffff88040d25bce8
>> Jan 6 22:37:29 vostro kernel: [ 7663.251293] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0720072007200728
>> Jan 6 22:37:29 vostro kernel: [ 7663.251313] RBP: ffff88040d25bbb8 R08: dead000000200200 R09: dead000000100100
>> Jan 6 22:37:29 vostro kernel: [ 7663.251333] R10: ffff88040d25bc38 R11: ffff8804078acec0 R12: ffff88040bfdc001
>> Jan 6 22:37:29 vostro kernel: [ 7663.251354] R13: ffffea0010137440 R14: 0720072007200728 R15: 0000000000000001
>> Jan 6 22:37:29 vostro kernel: [ 7663.251374] FS: 0000000000000000(0000) GS:ffff88041fa80000(0000) knlGS:0000000000000000
>> Jan 6 22:37:29 vostro kernel: [ 7663.251396] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> Jan 6 22:37:29 vostro kernel: [ 7663.251413] CR2: 00002b876c545978 CR3: 00000000018f6000 CR4: 00000000000407e0
>> Jan 6 22:37:29 vostro kernel: [ 7663.251432] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> Jan 6 22:37:29 vostro kernel: [ 7663.251452] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Jan 6 22:37:29 vostro kernel: [ 7663.251472] Process kswapd0 (pid: 795, threadinfo ffff88040d25a000, task ffff88040d07ce30)
>> Jan 6 22:37:29 vostro kernel: [ 7663.251494] Stack:
>> Jan 6 22:37:29 vostro kernel: [ 7663.251501] ffff88040d25bbe8 ffffffff810f6994 ffffea0010137440 0000000000000000
>> Jan 6 22:37:29 vostro kernel: [ 7663.251527] ffff88040d25bde8 ffff88041fddad00 ffff88040d25bc58 ffffffff810f6b9e
>> Jan 6 22:37:29 vostro kernel: [ 7663.251551] 0000000000000000 ffff8804046d2dc0 00000000810dee97 ffff88040d25bce8
>> Jan 6 22:37:29 vostro kernel: [ 7663.251576] Call Trace:
>> Jan 6 22:37:29 vostro kernel: [ 7663.251587] [<ffffffff810f6994>] page_lock_anon_vma+0x40/0xaf
>> Jan 6 22:37:29 vostro kernel: [ 7663.251605] [<ffffffff810f6b9e>] page_referenced+0x78/0x1b7
>> Jan 6 22:37:29 vostro kernel: [ 7663.251623] [<ffffffff810e026a>] shrink_active_list+0x209/0x305
>> Jan 6 22:37:29 vostro kernel: [ 7663.251641] [<ffffffff810e1269>] kswapd+0x3fe/0x8ea
>> Jan 6 22:37:29 vostro kernel: [ 7663.251658] [<ffffffff81091697>] ? wake_up_bit+0x25/0x25
>> Jan 6 22:37:29 vostro kernel: [ 7663.251675] [<ffffffff810e0e6b>] ? try_to_free_pages+0x8c/0x8c
>> Jan 6 22:37:29 vostro kernel: [ 7663.251692] [<ffffffff81091120>] kthread+0x90/0x98
>> Jan 6 22:37:29 vostro kernel: [ 7663.251707] [<ffffffff81091090>] ? kthread_freezable_should_stop+0x3c/0x3c
>> Jan 6 22:37:29 vostro kernel: [ 7663.251727] [<ffffffff815d5dec>] ret_from_fork+0x7c/0xb0
>> Jan 6 22:37:29 vostro kernel: [ 7663.251743] [<ffffffff81091090>] ? kthread_freezable_should_stop+0x3c/0x3c
>> Jan 6 22:37:29 vostro kernel: [ 7663.251762] Code: 8d 53 08 c7 03 01 00 00 00 48 39 d0 74 09 48 8b 78 10 e8 a0 79 ac ff 66 83 43 04 01 5a 5b c9 c3 55 b8 01 00 00 00 48 89 e5 31 d2 <f0> 0f b1 17 ff c8 75 0f 65 48 8b 04 25 00 b8 00 00 b2 01 48 89
>> Jan 6 22:37:29 vostro kernel: [ 7663.251898] RIP [<ffffffff815d3dee>] mutex_trylock+0xb/0x26
>> Jan 6 22:37:29 vostro kernel: [ 7663.251916] RSP <ffff88040d25bbb8>
>> Jan 6 22:37:29 vostro kernel: [ 7663.471083] ---[ end trace 15db67145b2c838a ]---
>> Jan 6 22:37:39 vostro kernel: [ 7672.954999] SysRq : Emergency Sync
>>
>>
>>
>> It seemed the kernel was still running, disk was doing some work and CPU fan was changing its speed.
>> I then pressed alt+sysrq+i and got (retyped from a camera picture which is attached as this one was
>> not in /var/log/messages):
>>
>> lock_anon_vma_root.clone
>> unlink_anon_vmas
>> free_pgtables
>> exit_mmap
>> mmput
>> exit_mm
>> do_exit
>> ? recalc_sigpending_tsk
>> do_group_exit
>> get_signal_to_deliver
>> do_signal
>> ? timespec_add_safe
>> ? __fput
>> do_notify_resume
>> int_signal
>>
>> But the system was dead, I had to turn off the power.
>>
>>
>> Any clues? What kernel .config item should I enable/disable to avoid it in the future? ;-)
>> Thank you,
>> Martin
>
> One of your struct anon_vmas seems to have been overwritten with 0x0720s.
> I've no idea why. But since you mention you've put SLUB in, best to take
> advantage of it by rebooting with slub_debug=AFPZ and see if that shows
> up anything interesting.
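To make the "overwritten with 0x0720s" reading concrete, here is a minimal sketch that inspects the faulting pointer from the register dump (RDI and R14 both hold 0x0720072007200728). It assumes 48-bit virtual addressing, where an address is canonical only if bits 63..47 are all equal, and it assumes that the mutex sits at offset 8 in struct anon_vma, right after the root pointer, which is how I read the 3.7 layout. The helper names is_canonical and words16 are made up for this illustration. As far as I can tell, the <f0> 0f b1 17 bytes in the Code: line decode to lock cmpxchg %edx,(%rdi), so a non-canonical RDI would be consistent with a general protection fault rather than a page fault.

```python
# Sketch only: inspect the pointer that mutex_trylock() was handed in the oops.
RDI = 0x0720072007200728  # copied from the RDI/R14 values in the register dump


def is_canonical(addr):
    """Assumes 48-bit virtual addresses: bits 63..47 must be all 0 or all 1."""
    top = addr >> 47  # the upper 17 bits of a 64-bit value
    return top == 0 or top == 0x1FFFF


def words16(value):
    """Split a 64-bit value into four 16-bit words, least significant first."""
    return [(value >> shift) & 0xFFFF for shift in range(0, 64, 16)]


# A non-canonical address cannot be dereferenced at all.
print("canonical:", is_canonical(RDI))

# Three of the four words are the repeating 0x0720 pattern.
print("words:", [hex(w) for w in words16(RDI)])

# Assumption: the mutex is at offset 8, so the root pointer read from the
# corrupted anon_vma would have been RDI - 8.
print("root:", hex(RDI - 8))
```

If the offset assumption holds, the root pointer itself was pure 0x0720 repeats and only the +8 for the mutex field produced the trailing 0x0728.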
I can only add that since the changes to my .config (see attached diff) I have not had a crash
in the last few days. However, the kernel reported two kmemleaks to me, and although I passed them
on to linux-pci and lkml I got no answer. Actually, I have no idea whether they could be related
to the issue, or whether some of those debug or lock correctness options fixed it inadvertently.
Why there was CPUS=4096 in the past and CONFIG_SPLIT_PTLOCK_CPUS=999999 at the moment, I do not know.
It looks silly on a dual-core i7 laptop.
Martin
[-- Attachment #2: .config.diff --]
[-- Type: text/plain, Size: 2882 bytes --]
--- /tmp/.config-bad 2013-01-11 12:47:05.000000000 +0100
+++ /tmp/.config-good 2013-01-11 12:49:51.000000000 +0100
@@ -194,12 +194,7 @@
CONFIG_DEFAULT_IOSCHED="deadline"
CONFIG_PADATA=y
CONFIG_ASN1=y
-CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
-CONFIG_INLINE_READ_UNLOCK=y
-CONFIG_INLINE_READ_UNLOCK_IRQ=y
-CONFIG_INLINE_WRITE_UNLOCK=y
-CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
-CONFIG_MUTEX_SPIN_ON_OWNER=y
+CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_FREEZER=y
CONFIG_ZONE_DMA=y
@@ -238,7 +233,6 @@
CONFIG_X86_MCE_INTEL=y
CONFIG_X86_MCE_THRESHOLD=y
CONFIG_X86_THERMAL_VECTOR=y
-CONFIG_I8K=y
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
CONFIG_MICROCODE_OLD_INTERFACE=y
@@ -269,7 +263,7 @@
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
CONFIG_PAGEFLAGS_EXTENDED=y
-CONFIG_SPLIT_PTLOCK_CPUS=4
+CONFIG_SPLIT_PTLOCK_CPUS=999999
CONFIG_COMPACTION=y
CONFIG_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
@@ -278,8 +272,6 @@
CONFIG_VIRT_TO_BUS=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
-CONFIG_TRANSPARENT_HUGEPAGE=y
-CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_CLEANCACHE=y
CONFIG_FRONTSWAP=y
CONFIG_X86_RESERVE_LOW=64
@@ -558,7 +550,6 @@
CONFIG_CB710_DEBUG_ASSUMPTIONS=y
-CONFIG_INTEL_MEI=y
CONFIG_HAVE_IDE=y
CONFIG_SCSI_MOD=y
@@ -902,9 +893,13 @@
CONFIG_CHARGER_GPIO=m
CONFIG_CHARGER_SMB347=m
CONFIG_HWMON=y
+CONFIG_HWMON_VID=m
CONFIG_HWMON_DEBUG_CHIP=y
+CONFIG_SENSORS_GPIO_FAN=m
CONFIG_SENSORS_CORETEMP=y
+CONFIG_SENSORS_IT87=m
+CONFIG_SENSORS_JC42=m
CONFIG_SENSORS_LTC4215=m
CONFIG_SENSORS_ACPI_POWER=m
@@ -1490,22 +1485,32 @@
CONFIG_FRAME_WARN=2048
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_FS=y
+CONFIG_HEADERS_CHECK=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SHIRQ=y
CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
-CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
-CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=1
+CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=0
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
-CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
-CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=1
+CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
+CONFIG_SLUB_DEBUG_ON=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
+CONFIG_DEBUG_KMEMLEAK=y
+CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
+CONFIG_DEBUG_RT_MUTEXES=y
+CONFIG_DEBUG_PI_LIST=y
+CONFIG_DEBUG_SPINLOCK=y
+CONFIG_DEBUG_MUTEXES=y
+CONFIG_DEBUG_LOCK_ALLOC=y
+CONFIG_PROVE_LOCKING=y
+CONFIG_LOCKDEP=y
+CONFIG_TRACE_IRQFLAGS=y
+CONFIG_STACKTRACE=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_MEMORY_INIT=y
@@ -1536,6 +1541,7 @@
CONFIG_DEFAULT_IO_DELAY_TYPE=0
CONFIG_DEBUG_BOOT_PARAMS=y
CONFIG_OPTIMIZE_INLINING=y
+CONFIG_DEBUG_STRICT_USER_COPY_CHECKS=y
CONFIG_KEYS=y
CONFIG_SECURITYFS=y