public inbox for linux-rdma@vger.kernel.org
* [BUG] mellanox IB driver fails to load on large config
@ 2015-07-10 19:15 andrew banman
From: andrew banman @ 2015-07-10 19:15 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Doug Ledford, Sean Hefty, Hal Rosenstock, Or Gerlitz,
	David S. Miller, Roland Dreier, Matan Barak, Moni Shoua,
	Jack Morgenstein, Yishai Hadas, Eran Ben Elisha, Ira Weiny,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

I'm seeing a large number of allocation errors originating from the Mellanox IB
driver when booting the 4.2-rc1 kernel on a system with 4096 CPUs and 32 TB of
memory:

8<---
<mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 64; reverting to legacy
<mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 65; reverting to legacy
<mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 66; reverting to legacy
<mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 67; reverting to legacy
<mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 68; reverting to legacy
<mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 69; reverting to legacy
<mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 70; reverting to legacy
<mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 71; reverting to legacy
......
<mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 123; reverting to legacy
--->8

The failing function is in drivers/infiniband/hw/mlx4/main.c:

8<---
static void mlx4_ib_alloc_eqs(struct mlx4_dev *dev, struct mlx4_ib_dev *ibdev)
...
                        /* Set IRQ for specific name (per ring) */
                        if (mlx4_assign_eq(dev, name, NULL,
                                           &ibdev->eq_table[eq])) {
                                /* Use legacy (same as mlx4_en driver) */
                                pr_warn("Can't allocate EQ %d; reverting to legacy\n", eq);
                                ibdev->eq_table[eq] =
                                        (eq % dev->caps.num_comp_vectors);
                        }
--->8
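
For reference, the legacy fallback in the excerpt above reduces to a modulo over
the existing completion vectors. A minimal user-space sketch, assuming nothing
beyond that one expression (legacy_vector() is a hypothetical helper, not a
driver function, and the vector count passed in is illustrative only):

```c
/*
 * Hypothetical stand-in for the fallback branch of mlx4_ib_alloc_eqs():
 * when no dedicated EQ can be assigned, the table slot reuses one of the
 * existing completion vectors, picked by taking the slot index modulo
 * the number of completion vectors.
 */
static int legacy_vector(int eq, int num_comp_vectors)
{
	return eq % num_comp_vectors;
}
```

If num_comp_vectors were 64 (an assumed value, not taken from this system),
slots 64 and 65 would share vectors 0 and 1 with earlier slots, which would
explain why the failure is not fatal.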

The problem doesn't appear to be fatal. At this point I am unsure if this is
actually expected behavior, so I'm looking for some insight into the issue.

At first we believed the problem to be with request_irq, but after adding some
debug code we found that mlx4_assign_eq returned -28 (-ENOSPC), indicating that
vec was never assigned:

8<---
@@ -1401,6 +1402,7 @@ int mlx4_assign_eq(struct mlx4_dev *dev, char *name, struct cpu_rmap *rmap,
        if (vec) {
                *vector = vec;
        } else {
+               pr_crit("!!! debug: mlx4_assign_eq - last err %d\n", err);
                *vector = 0;
                err = (i == dev->caps.comp_pool) ? -ENOSPC : err;
        }
--->8

8<---
 [ 1565.416273] !!! debug: mlx4_assign_eq - last err 0
 [ 1565.416275] <mlx4_ib> mlx4_ib_alloc_eqs: !!! debug: mlx4_assign_eq returned -28
 [ 1565.416277] <mlx4_ib> mlx4_ib_alloc_eqs: Can't allocate EQ 64; reverting to legacy
--->8
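
The debug print shows err as 0 because it runs before err is overwritten; the
-28 comes from the i == dev->caps.comp_pool check on the following line. A
simplified user-space model of that tail, as I understand it (assign_eq_model(),
in_use[] and pool_size are hypothetical stand-ins, and the search loop is only
an assumption about what precedes the hunk above):

```c
#include <errno.h>

/*
 * Simplified model of the end of mlx4_assign_eq(). in_use[] and pool_size
 * stand in for the driver's vector pool and dev->caps.comp_pool.
 * Returns 0 and a non-zero *vector on success, -ENOSPC if the whole pool
 * was scanned without finding a free entry.
 */
static int assign_eq_model(const int *in_use, int pool_size, int *vector)
{
	int vec = 0, err = 0, i;

	/* Assumed search loop: stop at the first free pool entry. */
	for (i = 0; !vec && i < pool_size; i++) {
		if (!in_use[i])
			vec = i + 1;	/* non-zero marks success in this model */
	}

	if (vec) {
		*vector = vec;
	} else {
		/* err is still 0 here, matching the "last err 0" debug line. */
		*vector = 0;
		err = (i == pool_size) ? -ENOSPC : err;
	}
	return err;
}
```

With every entry of in_use[] set, the model returns -ENOSPC (-28 on Linux)
without any IRQ request having failed, which is consistent with the log above.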


Any help would be greatly appreciated!

Andrew Banman

