From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail172.messagelabs.com (mail172.messagelabs.com [216.82.254.3])
	by kanga.kvack.org (Postfix) with ESMTP id AC6106001DA
	for ; Wed, 3 Feb 2010 17:41:31 -0500 (EST)
Date: Wed, 3 Feb 2010 14:39:21 -0800
From: Andrew Morton
Subject: Re: [Bugme-new] [Bug 15214] New: Oops at __rmqueue+0x51/0x2b3
Message-Id: <20100203143921.f2c96e8c.akpm@linux-foundation.org>
In-Reply-To:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: linux-mm@kvack.org
Cc: bugzilla-daemon@bugzilla.kernel.org, bugme-daemon@bugzilla.kernel.org, Mel Gorman , Johannes Weiner , ajlill@ajlc.waterloo.on.ca
List-ID:

(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 3 Feb 2010 02:30:22 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=15214
>
> Summary: Oops at __rmqueue+0x51/0x2b3
> Product: Memory Management
> Version: 2.5
> Kernel Version: 2.6.32.7
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Page Allocator
> AssignedTo: akpm@linux-foundation.org
> ReportedBy: ajlill@ajlc.waterloo.on.ca
> Regression: Yes
>
>
> Created an attachment (id=24887)
> --> (http://bugzilla.kernel.org/attachment.cgi?id=24887)
> .config file
>
> I get an Oops when doing a lot of filesystem reads. The process, cfagent, is
> running through the filesystem checksumming files when it dies. It doesn't
> happen every time cfagent runs, but there's a pretty good chance it will.
> This problem happens on 2.6.31.* as well, 2.6.30.10 appears to be stable. It
> happens on two different computers, so it's unlikely to be hardware.
> Also, in 2.6.32.*, I get an Oops at
>
> BUG_ON(page_zone(start_page) != page_zone(end_page));
>
> in move_freepages when I do sysctl -w vm.min_free_kbytes=16384
>
> but I can only reliably reproduce it when I do the sysctl from the boot
> scripts, and I'm having trouble getting netconsole started beforehand to
> capture the full output.
>
> gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
>
> Full text of Oops:
>
> BUG: unable to handle kernel paging request at 6eae67fc
> IP: [] __rmqueue+0x51/0x2b3
> *pdpt = 00000000351be001 *pde = 0000000000000000
> Oops: 0002 [#1] SMP
> last sysfs file: /sys/class/firmware/0000:00:0b.0/loading
> Modules linked in: netconsole af_packet autofs4 nfsd nfs lockd fscache nfs_acl
> auth_rpcgss sunrpc ipv6 nls_iso8859_1 nls_cp437 vfat fat xfs exportfs fuse
> configfs dm_snapshot dm_mirror dm_region_hash dm_log dm_mod eeprom w83781d
> hwmon_vid hwmon r128 drm tuner_simple tuner_types tuner msp3400 saa7115 button
> processor ivtv i2c_algo_bit cx2341x v4l2_common videodev psmouse parport_pc
> v4l1_compat rtc_cmos parport tveeprom i2c_piix4 rtc_core intel_agp serio_raw
> rtc_lib agpgart i2c_core shpchp pci_hotplug pcspkr evdev ext3 jbd mbcache raid1
> sg sr_mod sd_mod cdrom crc_t10dif ata_generic pata_acpi pata_pdc202xx_old
> ata_piix floppy e1000 uhci_hcd libata thermal fan unix [last unloaded:
> scsi_wait_scan]
>
> Pid: 6629, comm: cfagent Not tainted (2.6.32.7 #1) System Name
> EIP: 0060:[] EFLAGS: 00210002 CPU: 0
> EIP is at __rmqueue+0x51/0x2b3
> EAX: c146a018 EBX: 0000000a ECX: 6eae67f8 EDX: c050b654
> ESI: c050b644 EDI: 00200246 EBP: f51c9d1c ESP: f51c9cec
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Process cfagent (pid: 6629, ti=f51c8000 task=f51b40b0 task.ti=f51c8000)
> Stack:
> 00000002 00000000 c050b260 00000001 f6ba8280 00200002 c0193c92 c019404e
> <0> c146a000 c1479ff8 c050b260 00200246 f51c9d78 c0193cd5 f51c9d7c 00000002
> <0> 00000000 00000000 000201da c050c16c 00000000 c050b280 00000001 0000001f
> Call Trace:
> [] ? get_page_from_freelist+0xdf/0x3a8
> [] ? __alloc_pages_nodemask+0xdd/0x481
> [] ? get_page_from_freelist+0x122/0x3a8
> [] ? __alloc_pages_nodemask+0xdd/0x481
> [] ? _d_rehash+0x3c/0x40
> [] ? __do_page_cache_readahead+0x80/0x15b
> [] ? __d_lookup+0xa1/0xd5
> [] ? ra_submit+0x17/0x1c
> [] ? ondemand_readahead+0x150/0x15c
> [] ? page_cache_sync_readahead+0x16/0x1b
> [] ? generic_file_aio_read+0x212/0x507
> [] ? do_sync_read+0xab/0xe9
> [] ? mmap_region+0x25b/0x334
> [] ? autoremove_wake_function+0x0/0x33
> [] ? security_file_permission+0xf/0x11
> [] ? do_sync_read+0x0/0xe9
> [] ? vfs_read+0x8a/0x13f
> [] ? sys_read+0x3b/0x60
> [] ? sysenter_do_call+0x12/0x27
> Code: 2c c1 e1 03 8d 94 30 20 02 00 00 e9 8a 00 00 00 8d 72 0c 8d 04 0e 39 00
> 74 7c 8b 55 d0 8b 04 d6 8d 48 e8 89 4d f0 8b 08 8b 50 04 <89> 51 04 89 0a c7 40
> 04 00 02 20 00 c7 00 00 01 10 00 0f ba 70
> EIP: [] __rmqueue+0x51/0x2b3 SS:ESP 0068:f51c9cec
> CR2: 000000006eae67fc
> ---[ end trace db0096b2091950d0 ]---
>

Strange regression.

I'd be suspecting that we've mucked up the initial mem_map, perhaps
because of a wart in the e820 or acpi tables. Or perhaps it's
something else.