From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matan Barak
Subject: Re: [BUG] mellanox IB driver fails to load on large config
Date: Wed, 15 Jul 2015 14:33:33 +0300
Message-ID: <55A6450D.80300@mellanox.com>
References: <20150710191506.GA52396@asylum.americas.sgi.com>
 <20150714182234.GD17920@asylum.americas.sgi.com>
 <20150714184820.GB58053@asylum.americas.sgi.com>
 <20150714202848.GD58053@asylum.americas.sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20150714202848.GD58053-UiYq4lDBhggTG1waqwXmH7Cf4lofQVJ7@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Alex Thorlton, Or Gerlitz
Cc: andrew banman, Linux Kernel, Doug Ledford, Sean Hefty, Hal Rosenstock,
 Or Gerlitz, "David S. Miller", Roland Dreier, Moni Shoua, Jack Morgenstein,
 Yishai Hadas, Eran Ben Elisha, Ira Weiny,
 "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
List-Id: linux-rdma@vger.kernel.org

On 7/14/2015 11:28 PM, Alex Thorlton wrote:
> On Tue, Jul 14, 2015 at 11:06:26PM +0300, Or Gerlitz wrote:
>> On Tue, Jul 14, 2015 at 9:48 PM, Alex Thorlton wrote:
>>> On Tue, Jul 14, 2015 at 01:22:34PM -0500, andrew banman wrote:
>>>> On Sat, Jul 11, 2015 at 11:20:19PM +0300, Or Gerlitz wrote:
>>>>> On Fri, Jul 10, 2015 at 10:15 PM, andrew banman wrote:
>>>>>> I'm seeing a large number of allocation errors originating from the Mellanox IB
>>>>>> driver when booting the 4.2-rc1 kernel on a 4096cpu 32TB memory system:
>>>>>
>>>>> Just to make sure, mlx4 works fine on this small (...) system with 4.1
>>>>> and 4.2-rc1 breaks, or 4.2-rc1 is the 1st time you're trying that
>>>>> config?
>>>>
>>>> I'll let Alex comment on that, he did some testing on that.
>>>
>>> I started seeing this on a 4.1-rc8 kernel, so it's been around for a
>>> little while. It may have been around before 4.1-rc8, but I haven't run
>>> any kernels older than that on the big machine for some time.
>>
>> To make sure I am correctly following, on 4.1-rc8 you also see
>> something, right?
>
> Yes, that's correct.
>
>> are these the same messages or different ones? if the latter send to us.
>
> We see the same exact messages on 4.1-rc8.

Hi,

We don't recall getting those errors with 32-CPU machines, but we'll try to
reproduce this issue.

Matan

>
> Thanks for looking into this!
>
> - Alex
>