public inbox for linux-rdma@vger.kernel.org
From: Dongsu Park <dongsu.park-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
To: "Hudzia, Benoit" <benoit.hudzia-y6kNeMnOB+c@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: mlx4 module loading fail
Date: Thu, 7 Mar 2013 13:38:54 +0100
Message-ID: <20130307123854.GB15491@gmail.com>
In-Reply-To: <96353B6F8A3DAE4BBC51047BD0E6BAC20913A5-v0w1aZ/WxVLTw0Kyn31wWKuC/IaeJB0jHWlK3eZauXw@public.gmane.org>

Hi,

On 07.03.2013 11:18, Hudzia, Benoit wrote:
> The server specs are as follows:
> 	* 4x 10-core Intel(R) Xeon(R) CPU E7- 4870  @ 2.40GHz stepping 02
> 	* 1TB of RAM
> 	* 1x ConnectX-2 IB
> 
> Kernel Version : 3.5.0 
> 
> Note that if I downgrade to a 3.2 kernel I do not experience this issue. However, I am forced to work with 3.5 or higher. Can somebody help me with this?

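This was probably fixed by commit 89dd86db ("mlx4_core: Allow large
mlx4_buddy bitmaps"), which is included in 3.6 and later. Your trace
shows the kmalloc() in mlx4_buddy_init() failing (probe error -12 is
-ENOMEM); as far as I recall, that commit falls back to vmalloc() when
the kmalloc() of a buddy bitmap fails.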

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit?h=linux-3.6.y&id=89dd86db
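
To give a feel for why a single kmalloc() can be too large here, below is
a rough back-of-the-envelope sketch, not the driver code. It assumes
64-bit longs, a 4 MiB upper bound on one physically contiguous allocation
(4 KiB pages, MAX_ORDER 11), and that the bitmap for order i holds
1 << (max_order - i) bits, which is how I remember mlx4_buddy_init()
sizing it. The function names and the max_order values are made up for
illustration; I do not know the actual value on your machine.

```python
BITS_PER_LONG = 64                       # assumption: 64-bit kernel
SIZEOF_LONG = 8                          # bytes per long on that kernel
ASSUMED_KMALLOC_MAX = 4 * 1024 * 1024    # assumption: 4 KiB pages, MAX_ORDER 11


def bits_to_longs(nbits):
    """Number of longs needed to hold nbits (like the kernel's BITS_TO_LONGS)."""
    return (nbits + BITS_PER_LONG - 1) // BITS_PER_LONG


def buddy_bitmap_bytes(max_order, order):
    """Bytes needed for the free bitmap of one order in a buddy allocator
    that covers 2**max_order entries (assumed sizing, see text above)."""
    return bits_to_longs(1 << (max_order - order)) * SIZEOF_LONG


def orders_needing_fallback(max_order, limit=ASSUMED_KMALLOC_MAX):
    """Orders whose bitmap would exceed the assumed contiguous-allocation
    limit and would therefore need something like vmalloc()."""
    return [order for order in range(max_order + 1)
            if buddy_bitmap_bytes(max_order, order) > limit]
```

With these assumptions, a hypothetical max_order of 25 needs a 4 MiB
bitmap for order 0 and still fits, while 26 needs 8 MiB and does not.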

Regards,
Dongsu

> Thanks 
> Benoit
> 
> Kernel log trace: 
> 
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423038] ------------[ cut here ]------------
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423049] WARNING: at mm/page_alloc.c:2298 __alloc_pages_nodemask+0x2b9/0x810()
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423050] Hardware name: QSSC-S4R
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423051] Modules linked in: joydev coretemp kvm_intel kvm microcode pcspkr ixgbe mlx4_core(+) igb mdio ioatdma i2c_i801 hid_generic lpc_ich i2c_core mfd_core dca tpm_tis tpm tpm_bios acpi_memhotplug evbug crc32c_intel megaraid_sas usbhid hid
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423078] Pid: 949, comm: modprobe Not tainted 3.5.0-heca-dev-34dd48a+ #29
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423079] Call Trace:
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423088]  [<ffffffff8104baef>] warn_slowpath_common+0x7f/0xc0
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423091]  [<ffffffff8104bb4a>] warn_slowpath_null+0x1a/0x20
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423093]  [<ffffffff811028b9>] __alloc_pages_nodemask+0x2b9/0x810
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423096]  [<ffffffff81102785>] ? __alloc_pages_nodemask+0x185/0x810
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423101]  [<ffffffff81137086>] alloc_pages_current+0xb6/0x120
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423105]  [<ffffffff810fe02e>] __get_free_pages+0xe/0x40
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423108]  [<ffffffff8113fcff>] kmalloc_order_trace+0x3f/0xd0
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423110]  [<ffffffff810fe02e>] ? __get_free_pages+0xe/0x40
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423113]  [<ffffffff811405e0>] __kmalloc+0x100/0x160
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423131]  [<ffffffffa01ba35d>] mlx4_buddy_init+0xed/0x1a0 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423140]  [<ffffffffa01bb8aa>] mlx4_init_mr_table+0xca/0x150 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423148]  [<ffffffffa01b6fa7>] mlx4_setup_hca.part.12+0xf7/0x4e0 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423156]  [<ffffffffa01aaeef>] ? mlx4_bitmap_init+0x8f/0xb0 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423164]  [<ffffffffa01b73bb>] mlx4_setup_hca+0x2b/0x70 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423172]  [<ffffffffa01b7ba4>] __mlx4_init_one+0x744/0x960 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423179]  [<ffffffffa01c55b6>] mlx4_init_one+0x3d/0x42 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423186]  [<ffffffff812e6e56>] pci_call_probe+0x96/0xb0
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423189]  [<ffffffff812e8019>] pci_device_probe+0x79/0xa0
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423194]  [<ffffffff813894fa>] ? driver_sysfs_add+0x7a/0xb0
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423196]  [<ffffffff813896b8>] really_probe+0x68/0x200
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423198]  [<ffffffff81389982>] driver_probe_device+0x22/0x30
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423200]  [<ffffffff81389a3b>] __driver_attach+0xab/0xb0
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423202]  [<ffffffff81389990>] ? driver_probe_device+0x30/0x30
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423205]  [<ffffffff81387c46>] bus_for_each_dev+0x56/0x90
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423207]  [<ffffffff813892fe>] driver_attach+0x1e/0x20
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423210]  [<ffffffff81388ed0>] bus_add_driver+0x1a0/0x270
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423216]  [<ffffffffa01d2031>] ? mlx4_catas_init+0x31/0x31 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423218]  [<ffffffff81389f86>] driver_register+0x76/0x130
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423223]  [<ffffffff8157aa9d>] ? notifier_call_chain+0x4d/0x70
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423227]  [<ffffffff8109f0b0>] ? add_kallsyms+0x1e0/0x1e0
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423233]  [<ffffffffa01d2031>] ? mlx4_catas_init+0x31/0x31 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423235]  [<ffffffff812e7d85>] __pci_register_driver+0x55/0xd0
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423241]  [<ffffffffa01d2031>] ? mlx4_catas_init+0x31/0x31 [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423246]  [<ffffffffa01d20dd>] mlx4_init+0xac/0xec [mlx4_core]
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423250]  [<ffffffff8100203f>] do_one_initcall+0x3f/0x170
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423253]  [<ffffffff810a18bf>] sys_init_module+0x8f/0x200
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423257]  [<ffffffff8157f0a9>] system_call_fastpath+0x16/0x1b
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423259] ---[ end trace 8886e8f0c535939d ]---
> Mar  7 03:12:27 bi-heca-02 kernel: [    7.423263] mlx4_core 0000:86:00.0: Failed to initialize memory region table, aborting.
> Mar  7 03:12:27 bi-heca-02 kernel: [    8.431444] mlx4_core: probe of 0000:86:00.0 failed with error -12
> 
> 
> 
> Dr. Benoit Hudzia
> Senior Researcher
> 
> SAP Next Business and Technology 
> SAP (UK) Limited
> The Concourse Building 
> Queen's Road , Queen's Island, Titanic Quarter
> BT3 9TD Belfast
> T +44 (0)28 9078 5742
> F +44 (0)28 9078  5777
> M +44 (0)79 834 46729
> mailto:benoit.hudzia-y6kNeMnOB+c@public.gmane.org
> www.sap.com/research
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
