From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dongsu Park
Subject: Re: mlx4 module loading fail
Date: Thu, 7 Mar 2013 13:38:54 +0100
Message-ID: <20130307123854.GB15491@gmail.com>
References: <96353B6F8A3DAE4BBC51047BD0E6BAC20913A5@DEWDFEMB17A.global.corp.sap>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Return-path:
Content-Disposition: inline
In-Reply-To: <96353B6F8A3DAE4BBC51047BD0E6BAC20913A5-v0w1aZ/WxVLTw0Kyn31wWKuC/IaeJB0jHWlK3eZauXw@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: "Hudzia, Benoit"
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
List-Id: linux-rdma@vger.kernel.org

Hi,

On 07.03.2013 11:18, Hudzia, Benoit wrote:
> The servers spec are as follow:
> * 4x 10 core Intel(R) Xeon(R) CPU E7- 4870 @ 2.40GHz stepping 02
> * 1TB of RAM
> * 1 connectx2 IB
>
> Kernel Version : 3.5.0
>
> Note if I downgrade to a 3.2 kernel I do not experience this issue. However I am forced to work with a 3.5 or higher. Can somebody help me with that?

Commit 89dd86db (mlx4_core: Allow large mlx4_buddy bitmaps), which is included in 3.6 and later, has probably fixed this problem already.
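
For context, here is a rough sketch of why the allocation in the trace can fail. It assumes that mlx4_buddy_init() allocates one bitmap per buddy order with kmalloc(), and that a physically contiguous allocation is capped at 2^(MAX_ORDER - 1) pages. The constants are assumed x86_64 defaults (4 KiB pages, MAX_ORDER of 11), not values taken from the report, and the helper names are made up for illustration only.

```c
#include <limits.h>
#include <stddef.h>

/* Assumed x86_64 defaults; not taken from the original report. */
#define ASSUMED_PAGE_SHIFT 12   /* 4 KiB pages */
#define ASSUMED_MAX_ORDER  11   /* page allocator serves orders 0..10 */

/* Bytes needed for a bitmap of nbits bits, rounded up to whole longs. */
static unsigned long long bitmap_bytes(unsigned long long nbits)
{
    unsigned long long bits_per_long = sizeof(long) * CHAR_BIT;

    return (nbits + bits_per_long - 1) / bits_per_long * sizeof(long);
}

/* Largest physically contiguous allocation under the assumed limits. */
static unsigned long long max_contiguous_bytes(void)
{
    return 1ULL << (ASSUMED_PAGE_SHIFT + ASSUMED_MAX_ORDER - 1);
}

/*
 * Returns 1 if a bitmap holding 2^log2_bits bits is larger than one
 * contiguous allocation can provide, 0 otherwise.
 */
static int bitmap_too_large(unsigned int log2_bits)
{
    return bitmap_bytes(1ULL << log2_bits) > max_contiguous_bytes();
}
```

Under these assumptions the limit is 4 MiB, so a bitmap of 2^25 bits still fits and one of 2^26 bits does not. As far as I understand the commit, it makes mlx4_buddy_init() fall back to vmalloc() when kmalloc() fails, so the bitmap no longer has to be physically contiguous.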
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit?h=linux-3.6.y&id=89dd86db

Regards,
Dongsu

> Thanks
> Benoit
>
> Kernel log trace:
>
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423038] ------------[ cut here ]------------
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423049] WARNING: at mm/page_alloc.c:2298 __alloc_pages_nodemask+0x2b9/0x810()
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423050] Hardware name: QSSC-S4R
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423051] Modules linked in: joydev coretemp kvm_intel kvm microcode pcspkr ixgbe mlx4_core(+) igb mdio ioatdma i2c_i801 hid_generic lpc_ich i2c_core mfd_core dca tpm_tis tpm tpm_bios acpi_memhotplug evbug crc32c_intel megaraid_sas usbhid hid
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423078] Pid: 949, comm: modprobe Not tainted 3.5.0-heca-dev-34dd48a+ #29
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423079] Call Trace:
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423088] [] warn_slowpath_common+0x7f/0xc0
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423091] [] warn_slowpath_null+0x1a/0x20
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423093] [] __alloc_pages_nodemask+0x2b9/0x810
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423096] [] ? __alloc_pages_nodemask+0x185/0x810
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423101] [] alloc_pages_current+0xb6/0x120
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423105] [] __get_free_pages+0xe/0x40
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423108] [] kmalloc_order_trace+0x3f/0xd0
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423110] [] ? __get_free_pages+0xe/0x40
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423113] [] __kmalloc+0x100/0x160
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423131] [] mlx4_buddy_init+0xed/0x1a0 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423140] [] mlx4_init_mr_table+0xca/0x150 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423148] [] mlx4_setup_hca.part.12+0xf7/0x4e0 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423156] [] ? mlx4_bitmap_init+0x8f/0xb0 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423164] [] mlx4_setup_hca+0x2b/0x70 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423172] [] __mlx4_init_one+0x744/0x960 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423179] [] mlx4_init_one+0x3d/0x42 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423186] [] pci_call_probe+0x96/0xb0
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423189] [] pci_device_probe+0x79/0xa0
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423194] [] ? driver_sysfs_add+0x7a/0xb0
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423196] [] really_probe+0x68/0x200
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423198] [] driver_probe_device+0x22/0x30
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423200] [] __driver_attach+0xab/0xb0
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423202] [] ? driver_probe_device+0x30/0x30
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423205] [] bus_for_each_dev+0x56/0x90
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423207] [] driver_attach+0x1e/0x20
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423210] [] bus_add_driver+0x1a0/0x270
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423216] [] ? mlx4_catas_init+0x31/0x31 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423218] [] driver_register+0x76/0x130
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423223] [] ? notifier_call_chain+0x4d/0x70
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423227] [] ? add_kallsyms+0x1e0/0x1e0
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423233] [] ? mlx4_catas_init+0x31/0x31 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423235] [] __pci_register_driver+0x55/0xd0
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423241] [] ? mlx4_catas_init+0x31/0x31 [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423246] [] mlx4_init+0xac/0xec [mlx4_core]
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423250] [] do_one_initcall+0x3f/0x170
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423253] [] sys_init_module+0x8f/0x200
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423257] [] system_call_fastpath+0x16/0x1b
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423259] ---[ end trace 8886e8f0c535939d ]---
> Mar 7 03:12:27 bi-heca-02 kernel: [ 7.423263] mlx4_core 0000:86:00.0: Failed to initialize memory region table, aborting.
> Mar 7 03:12:27 bi-heca-02 kernel: [ 8.431444] mlx4_core: probe of 0000:86:00.0 failed with error -12
>
>
>
> Dr. Benoit Hudzia
> Senior Researcher
>
> SAP Next Business and Technology
> SAP (UK) Limited
> The Concourse Building
> Queen's Road , Queen's Island, Titanic Quarter
> BT3 9TD Belfast
> T +44 (0)28 9078 5742
> F +44 (0)28 9078 5777
> M +44 (0)79 834 46729
> mailto:benoit.hudzia-y6kNeMnOB+c@public.gmane.org
> www.sap.com/research
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html