From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jesper Dangaard Brouer
Subject: Re: mlx5 driver loading failing on v4.19 / net-next / bpf-next
Date: Fri, 14 Sep 2018 08:36:18 +0200
Message-ID: <20180914083618.08fe816e@redhat.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: Tariq Toukan, Saeed Mahameed, "netdev@vger.kernel.org", Eran Ben Elisha, brouer@redhat.com
To: Alexei Starovoitov, Moshe Shemesh, Eli Cohen, Or Gerlitz
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:46640 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726831AbeINLt0 (ORCPT ); Fri, 14 Sep 2018 07:49:26 -0400
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On Thu, 13 Sep 2018 15:55:29 -0700
Alexei Starovoitov wrote:

> On Thu, Aug 30, 2018 at 1:35 AM, Tariq Toukan wrote:
> >
> >
> > On 29/08/2018 6:05 PM, Jesper Dangaard Brouer wrote:
> >>
> >> Hi Saeed,
> >>
> >> I'm having issues loading mlx5 driver on v4.19 kernels (tested both
> >> net-next and bpf-next), while kernel v4.18 seems to work. It happens
> >> with a Mellanox ConnectX-5 NIC (and also a CX4-Lx but I removed that
> >> from the system now).
> >>
> >
> > Hi Jesper,
> >
> > Thanks for your report!
> >
> > We are working to analyze and debug the issue.
>
> looks like serious issue to me... while no news in 2 weeks.
> any update?

Mellanox took it off-list, and on Sep 6th found that this is a
regression introduced by commit 269d26f47f6f ("net/mlx5: Reduce command
polling interval"), but only if CONFIG_PREEMPT is on. I can confirm
that reverting this commit fixed the issue (and not the firmware
upgrade I also did).

I think Moshe (Cc) is responsible for this case, and I expect to soon
see a revert or alternative solution to this!?

Thanks for the kick Alexei :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer