From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S262601AbVGMImN (ORCPT ); Wed, 13 Jul 2005 04:42:13 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S262602AbVGMImN (ORCPT ); Wed, 13 Jul 2005 04:42:13 -0400
Received: from mx.wurtel.net ([195.64.88.114]:50443 "EHLO mx.wurtel.net")
        by vger.kernel.org with ESMTP id S262601AbVGMImJ (ORCPT );
        Wed, 13 Jul 2005 04:42:09 -0400
Date: Wed, 13 Jul 2005 10:41:53 +0200
From: Paul Slootman
To: linux-kernel@vger.kernel.org
Subject: PROBLEM: Oops when running mkreiserfs on large (9TB) raid0 set on AMD64 SMP
Message-ID: <20050713084152.GA5765@wurtel.net>
Mail-Followup-To: linux-kernel@vger.kernel.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.6i
X-Scanner: exiscan *1DscoT-0005jo-00*gP/MRSCF1Ew*Wurtel
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

After having installed a base system (Debian) on our shiny new AMD64 SMP
system, with 2 x 3ware 9000 SATA controllers for a total of 24 disks
(excluding the 2 connected to the motherboard), I wanted as a "quick"
test to put them all in a raid0 and put a reiserfs on it (just to see
the 'df' output :-).

It all went OK up to the mkreiserfs, when an Oops happened and
mkreiserfs stopped doing anything useful (besides eating system time in
state 'D', untraceable). The system slowed to a crawl; typing over the
network lagged by 1-6 seconds.

Root and swap are on LVM2 over raid1 over 2 SATA disks connected to the
Tyan motherboard (SiI 3114 controller). That works fine.
lvm, raid1 and sata_sil are built into the kernel; 3w_9xxx and raid0 are
modules.

Note that the system time was a couple of weeks in the future...

Paul Slootman

# cat /proc/version
Linux version 2.6.12 (root@zaadzilla) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #1 SMP Fri Aug 5 20:36:51 CEST 2005

Output from kern.log:

Aug 9 20:08:37 localhost kernel: raid0: done.
Aug 9 20:08:37 localhost kernel: raid0 : md_size is 9374734848 blocks.
Aug 9 20:08:37 localhost kernel: raid0 : conf->hash_spacing is 9374734848 blocks.
Aug 9 20:08:37 localhost kernel: raid0 : nb_zone is 12.
Aug 9 20:08:37 localhost kernel: raid0 : Allocating 96 bytes for hash.
Aug 9 20:09:18 localhost kernel: Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
Aug 9 20:09:18 localhost kernel: {:raid0:raid0_make_request+472}
Aug 9 20:09:18 localhost kernel: PGD f3d16067 PUD f387e067 PMD 0
Aug 9 20:09:18 localhost kernel: Oops: 0000 [1] SMP
Aug 9 20:09:18 localhost kernel: CPU 1
Aug 9 20:09:18 localhost kernel: Modules linked in: raid0 ipv6 evdev tg3 3w_9xxx hw_random i2c_amd756 i2c_amd8111 i2c_core psmouse rtc
Aug 9 20:09:18 localhost kernel: Pid: 8901, comm: mkreiserfs Not tainted 2.6.12
Aug 9 20:09:18 localhost kernel: RIP: 0010:[__nosave_end+129608600/2131804160] {:raid0:raid0_make_request+472}
Aug 9 20:09:18 localhost kernel: RSP: 0018:ffff81007ec898e8 EFLAGS: 00010206
Aug 9 20:09:18 localhost kernel: RAX: 0000000000000078 RBX: ffff8100821c0500 RCX: ffff0201f361ee78
Aug 9 20:09:18 localhost kernel: RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8100f3f0b4a8
Aug 9 20:09:18 localhost kernel: RBP: 0000000000000040 R08: 0000000008bb1c67 R09: ffff8100f9b0f700
Aug 9 20:09:18 localhost kernel: R10: 000000000000007f R11: 000000007fe93d40 R12: ffff81007fdc4940
Aug 9 20:09:18 localhost kernel: R13: ffff8100f9b15308 R14: 0000000000001000 R15: 0000000000000001
Aug 9 20:09:18 localhost kernel: FS: 00002aaaaaf04090(0000) GS:ffffffff804b6e00(0000) knlGS:0000000000000000
Aug 9 20:09:18 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 9 20:09:18 localhost kernel: CR2: 0000000000000028 CR3: 00000000f3e80000 CR4: 00000000000006e0
Aug 9 20:09:18 localhost kernel: Process mkreiserfs (pid: 8901, threadinfo ffff81007ec88000, task ffff81007fcd1050)
Aug 9 20:09:18 localhost kernel: Stack: 0000000000000001 ffff8100821c0500 ffff8100f9b15308 ffff81007ec89948
Aug 9 20:09:18 localhost kernel: ffff8100821c0500 ffffffff8025c101 0000000000000000 ffff81007fcd1050
Aug 9 20:09:18 localhost kernel: ffffffff801483b0 ffff81007ec89960
Aug 9 20:09:18 localhost kernel: Call Trace:{generic_make_request+481} {autoremove_wake_function+0}
Aug 9 20:09:18 localhost kernel: {autoremove_wake_function+0} {autoremove_wake_function+0}
Aug 9 20:09:18 localhost kernel: {submit_bio+223} {bio_alloc_bioset+288}
Aug 9 20:09:18 localhost kernel: {submit_bh+273} {block_read_full_page+621}
Aug 9 20:09:18 localhost kernel: {radix_tree_node_alloc+19} {blkdev_get_block+0}
Aug 9 20:09:18 localhost kernel: {radix_tree_insert+114} {read_pages+178}
Aug 9 20:09:18 localhost kernel: {buffered_rmqueue+323} {__alloc_pages+963}
Aug 9 20:09:18 localhost kernel: {__do_page_cache_readahead+284} {blockable_page_cache_readahead+103}
Aug 9 20:09:18 localhost kernel: {page_cache_readahead+267} {do_generic_mapping_read+343}
Aug 9 20:09:18 localhost kernel: {file_read_actor+0} {__generic_file_aio_read+407}
Aug 9 20:09:18 localhost kernel: {generic_file_read+194} {__wake_up+67}
Aug 9 20:09:18 localhost kernel: {tty_ldisc_deref+117} {autoremove_wake_function+0}
Aug 9 20:09:18 localhost kernel: {autoremove_wake_function+0} {thread_return+0}
Aug 9 20:09:18 localhost kernel: {dnotify_parent+46} {vfs_read+191}
Aug 9 20:09:18 localhost kernel: {sys_read+83} {system_call+126}
Aug 9 20:09:18 localhost kernel:
Aug 9 20:09:18 localhost kernel:
Aug 9 20:09:18 localhost kernel: Code: 48 8b 42 28 48 89 43 10 b8 01 00 00 00 48 03 4a 40 48 89 0b
Aug 9 20:09:18 localhost kernel: RIP {:raid0:raid0_make_request+472} RSP
Aug 9 20:09:18 localhost kernel: CR2: 0000000000000028

Output of scripts/ver_linux:

Linux zaadzilla 2.6.12 #1 SMP Fri Aug 5 20:36:51 CEST 2005 x86_64 GNU/Linux

Gnu C                  3.3.5
Gnu make               3.80
binutils               2.15
util-linux             2.12p
mount                  2.12p
module-init-tools      3.2-pre1
e2fsprogs              1.37
reiserfsprogs          3.6.19
reiser4progs           line
Linux C Library        2.3.2
Dynamic linker (ldd)   2.3.2
Procps                 3.2.1
Net-tools              1.60
Kbd                    81:
Sh-utils               5.2.1
Modules Loaded         raid0 ipv6 evdev tg3 3w_9xxx hw_random i2c_amd756 i2c_amd8111 i2c_core psmouse rtc

# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 5
model name : AMD Opteron(tm) Processor 244
stepping : 10
cpu MHz : 1794.281
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips : 3522.56
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 5
model name : AMD Opteron(tm) Processor 244
stepping : 10
cpu MHz : 1794.281
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips : 3579.90
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

# cat /proc/modules
raid0 8192 0 - Live 0xffffffff8808e000
ipv6 285376 16 - Live 0xffffffff88047000
evdev 10560 0 - Live 0xffffffff88043000
tg3 99972 0 - Live 0xffffffff88029000
3w_9xxx 38404 0 - Live 0xffffffff8801e000
hw_random 6240 0 - Live 0xffffffff8801b000
i2c_amd756 7492 0 - Live 0xffffffff88018000
i2c_amd8111 6976 0 - Live 0xffffffff88015000
i2c_core 24384 2 i2c_amd756,i2c_amd8111, Live 0xffffffff8800e000
psmouse 31044 0 - Live 0xffffffff88005000
rtc 15056 0 - Live 0xffffffff88000000

# cat /proc/ioports
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
01f0-01f7 : ide0
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial
0cf8-0cff : PCI conf1
1000-10bf : motherboard
  1000-1003 : PM1a_EVT_BLK
  1004-1005 : PM1a_CNT_BLK
  1008-100b : PM_TMR
  1020-1023 : GPE0_BLK
  10b0-10b7 : GPE1_BLK
10c0-10df : motherboard
10e0-10ff : motherboard
  10e0-10ef : amd756-smbus
7000-7fff : PCI Bus #01
  7800-78ff : 0000:01:01.0
    7800-78ff : 3w-9xxx
8000-8fff : PCI Bus #02
  8800-88ff : 0000:02:02.0
    8800-88ff : 3w-9xxx
9000-bfff : PCI Bus #03
  a800-a83f : 0000:03:08.0
    a800-a83f : e100
  a880-a88f : 0000:03:05.0
    a880-a88f : sata_sil
  ac00-ac03 : 0000:03:05.0
    ac00-ac03 : sata_sil
  b000-b0ff : 0000:03:06.0
  b800-b807 : 0000:03:05.0
    b800-b807 : sata_sil
  b880-b883 : 0000:03:05.0
    b880-b883 : sata_sil
  bc00-bc07 : 0000:03:05.0
    bc00-bc07 : sata_sil
cc00-cc1f : 0000:00:07.2
  cc00-cc1f : amd8111 SMBus 2.0
de00-de7f : motherboard
de80-deff : motherboard
ffa0-ffaf : 0000:00:07.1
  ffa0-ffa7 : ide0
  ffa8-ffaf : ide1

# cat /proc/iomem
00000000-0009efff : System RAM
0009f000-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000c8000-000cc7ff : Adapter ROM
000ce000-000ce7ff : Adapter ROM
000f0000-000fffff : System ROM
00100000-f9feffff : System RAM
  00100000-0032cf67 : Kernel code
  0032cf68-0045d06f : Kernel data
f9ff0000-f9ffefff : ACPI Tables
f9fff000-f9ffffff : ACPI Non-volatile Storage
fa400000-fb4fffff : PCI Bus #01
  fa800000-faffffff : 0000:01:01.0
    fa800000-faffffff : 3w-9xxx
fb500000-fc5fffff : PCI Bus #02
  fb800000-fbffffff : 0000:02:02.0
    fb800000-fbffffff : 3w-9xxx
fc700000-fc7fffff : PCI Bus #01
  fc7ffc00-fc7ffcff : 0000:01:01.0
    fc7ffc00-fc7ffcff : 3w-9xxx
fc800000-fc8fffff : PCI Bus #02
  fc890000-fc89ffff : 0000:02:09.0
    fc890000-fc89ffff : tg3
  fc8a0000-fc8affff : 0000:02:09.0
    fc8a0000-fc8affff : tg3
  fc8c0000-fc8cffff : 0000:02:09.1
    fc8c0000-fc8cffff : tg3
  fc8d0000-fc8dffff : 0000:02:09.1
    fc8d0000-fc8dffff : tg3
  fc8ffc00-fc8ffcff : 0000:02:02.0
    fc8ffc00-fc8ffcff : 3w-9xxx
fc900000-feafffff : PCI Bus #03
  fd000000-fdffffff : 0000:03:06.0
  feaa0000-feabffff : 0000:03:08.0
    feaa0000-feabffff : e100
  feafb000-feafbfff : 0000:03:08.0
    feafb000-feafbfff : e100
  feafc000-feafcfff : 0000:03:00.0
    feafc000-feafcfff : ohci_hcd
  feafd000-feafdfff : 0000:03:00.1
    feafd000-feafdfff : ohci_hcd
  feafec00-feafefff : 0000:03:05.0
    feafec00-feafefff : sata_sil
  feaff000-feafffff : 0000:03:06.0
febfe000-febfefff : 0000:00:0b.1
febff000-febfffff : 0000:00:0a.1
ff780000-ffffffff : reserved

# lspci -vvv
0000:00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- Reset- FastB2B-
        Capabilities: [c0] #08 [0086]
        Capabilities: [f0] #08 [8000]

0000:00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
        Subsystem: Advanced Micro Devices [AMD] AMD-8111 LPC
        Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- Reset- FastB2B-
        Capabilities: [a0]
        Capabilities: [b8] #08 [8000]
        Capabilities: [c0] #08 [004a]

0000:00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X APIC (rev 01) (prog-if 10 [IO-APIC])
        Subsystem: Advanced Micro Devices [AMD]: Unknown device 36c0
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- Reset- FastB2B-
        Capabilities: [a0]
        Capabilities: [b8] #08 [8000]

0000:00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X APIC (rev 01) (prog-if 10 [IO-APIC])
        Subsystem: Advanced Micro Devices [AMD]: Unknown device 36c0
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR-

[remainder of lspci -vvv output garbled beyond recovery]
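
For anyone cross-checking the numbers in the kern.log excerpt, here is a
small Python sketch that only re-derives values already present in this
report. It assumes that the "blocks" md reports are 1 KiB units (an
assumption, although it agrees with the "9TB" in the subject), and it
hand-decodes just the first four bytes of the "Code:" line. The helper
names size_summary and decode_disp8_load are made up for illustration.
None of this establishes the root cause of the oops.

```python
# Values copied from the kern.log excerpt in this report.
MD_SIZE_BLOCKS = 9374734848        # "raid0 : md_size is 9374734848 blocks."
NB_ZONE = 12                       # "raid0 : nb_zone is 12."
HASH_BYTES = 96                    # "raid0 : Allocating 96 bytes for hash."
CR2 = 0x28                         # faulting address reported in the oops
RDX = 0x0                          # RDX value from the register dump
CODE = bytes.fromhex("488b4228")   # first four bytes of the "Code:" line

BLOCK_BYTES = 1024  # ASSUMPTION: md "blocks" are 1 KiB units


def size_summary(blocks, block_bytes=BLOCK_BYTES):
    """Convert a block count into bytes, TB and TiB, and check 32-bit fit."""
    total = blocks * block_bytes
    return {
        "bytes": total,
        "tb": total / 10**12,
        "tib": total / 2**40,
        "fits_in_u32": blocks < 2**32,
    }


def decode_disp8_load(code):
    """Decode one x86-64 'REX.W 8B /r' load with an 8-bit displacement.

    Only handles the simple form seen here: mod == 01 and no SIB byte.
    Returns (destination register, base register, displacement).
    """
    rex, opcode, modrm, disp = code
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W MOV r64, r/m64 instruction")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only register + disp8 addressing is handled")
    if disp >= 0x80:               # disp8 is a signed byte
        disp -= 0x100
    names = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
    return names[reg], names[rm], disp


if __name__ == "__main__":
    info = size_summary(MD_SIZE_BLOCKS)
    print("array size: %.2f TB (%.2f TiB)" % (info["tb"], info["tib"]))
    print("block count fits in 32 bits:", info["fits_in_u32"])
    print("bytes per hash entry:", HASH_BYTES // NB_ZONE)
    dst, base, disp = decode_disp8_load(CODE)
    print("first instruction: mov 0x%x(%%%s),%%%s" % (disp, base, dst))
    print("base + displacement equals CR2:", RDX + disp == CR2)
```

Under those assumptions the array is about 9.6 TB (8.73 TiB), the block
count no longer fits in 32 bits, and 96 bytes for 12 zones is 8 bytes
per hash entry. The first instruction decodes as a load from
0x28(%rdx), and with RDX = 0 that matches the reported CR2 of
0000000000000028.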