public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Or Gerlitz <ogerlitz@mellanox.com>
To: Roland Dreier <roland@kernel.org>, Jiri Kosina <jkosina@suse.cz>
Cc: Sagi Grimberg <sagig@mellanox.com>,
	Amir Vadai <amirv@mellanox.com>,
	Eli Cohen <eli@dev.mellanox.co.il>,
	Eugenia Emantayev <eugenia@mellanox.com>,
	"David S. Miller" <davem@davemloft.net>,
	Mel Gorman <mgorman@suse.de>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Saeed Mahameed <saeedm@mellanox.com>,
	Shlomo Pongratz <shlomop@mellanox.com>
Subject: Re: [PATCH] mlx4: Use GFP_NOFS calls during the ipoib TX path when creating the QP
Date: Tue, 11 Mar 2014 15:53:20 +0200	[thread overview]
Message-ID: <531F1550.3080401@mellanox.com> (raw)
In-Reply-To: <CAG4TOxMw4siOqtjXrgUDaw8yWqYgSG4d8PqGuJ62AZBrC2O2AA@mail.gmail.com>

On 05/03/2014 21:25, Roland Dreier wrote:
> It's quite clear that this is a general problem with IPoIB connected
> mode on any IB device.  In connected mode, a packet send can trigger
> establishing a new connection, which will allocate a new QP, which in
> particular will allocate memory for the QP in the low-level IB device
> driver.  Currently I'm positive that every driver will do GFP_KERNEL
> allocations when allocating a QP (ehca does both a GFP_KERNEL
> kmem_cache allocation and vmalloc in internal_create_qp(), mlx5 and
> mthca are similar to mlx4 and qib does vmalloc() in qib_create_qp()).
> So this patch needs to be extended to the other 4 IB device drivers in
> the tree.
>
> Also, I don't think GFP_NOFS is enough -- it seems we need GFP_NOIO,
> since we could be swapping to a block device over iSCSI over IPoIB-CM,
> so even non-FS stuff could deadlock.
>
> I don't think it makes any sense to have a "do_not_deadlock" module
> parameter, especially one that defaults to "false."  If this is the
> right thing to do, then we should just unconditionally do it.
>
> It does seem that only using GFP_NOIO when we really need to would be
> a very difficult problem--how can we carry information about whether a
> particular packet is involved in freeing memory through all the layers
> of, say, NFS, TCP, IPSEC, bonding, &c?

Agree with all the above... next,

If we don't have away to nicely overcome the layer violations here, 
let's change IPoIB so they always ask the
IB driver to allocate QPs used for Connected Mode in a GFP_NOIO manner, 
to be practical I suggest the following:

1. Add new QP creation flag IB_QP_CREATE_USE_GFP to the existing 
creation flags of struct ib_qp_init_attr
and a new "gfp_t gfp" field to that structure too

2. in the IPoIB CM code, do the vzalloc allocation for new connection in 
GFP_NOIO manner and issue
the call to create QP with setting the IB_QP_CREATE_USE_GFP flag and 
GFO_NOIO to the gfp field

3. If the QP creation fails, with -EINVAL, issue a warning and retry the 
QP creation attempt without the GFP setting

4. implement in the mlx4 driver the support for GFP directives on QP 
creation

5. for the rest of the IB drivers, return -EINVAL if 
IB_QP_CREATE_USE_GFP is set

This will allow to provide working solution for mlx4 users and gradually 
add support for the rest of the IB drivers.

as for proper patch planning

patch #1 / items 1 and 5
patch #2 / item 4
patch #3 / item 3

Re item 5 -- I made a check and the ehca, ipath and mthca driver already 
return -EINVAL if provided with any creation flag, so you only need to 
patch the qib driver in qib_create_qp() to do that as well which is trivial.

As for the rest of the code, you practically have it all by now, just 
need to port the mlx4 changes you did to the suggested framework, remove 
the module param (which you don't like either) and add the new gfp_t 
field to ib_qp_init_attr

So sounds like a plan that makes sense?

Or.


  reply	other threads:[~2014-03-11 13:53 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-26 21:18 [PATCH] mlx4: Use GFP_NOFS calls during the ipoib TX path when creating the QP Or Gerlitz
2014-02-27  9:48 ` Jiri Kosina
2014-02-27  9:58   ` Or Gerlitz
2014-02-27 10:42     ` Jiri Kosina
2014-03-04 22:48       ` Jiri Kosina
2014-03-05 15:57         ` Or Gerlitz
2014-03-05 19:25       ` Roland Dreier
2014-03-11 13:53         ` Or Gerlitz [this message]
2014-03-14 19:50           ` Jiri Kosina
2014-04-24 17:03           ` Jiri Kosina
2014-04-24 20:01             ` Or Gerlitz
2014-05-02 13:03               ` Jiri Kosina
  -- strict thread matches above, loose matches on Subject: below --
2014-02-21 21:53 Jiri Kosina
     [not found] ` <CAJZOPZK4Ah+nKPWnX3=yM43jbf586GYJ+fh0-OL4bOnqKK8v8A@mail.gmail.com>
2014-02-25 21:52   ` Or Gerlitz
2014-02-25 22:11   ` Jiri Kosina
2014-02-25 22:20     ` Or Gerlitz
2014-02-25 22:40       ` Jiri Kosina
2014-02-25 22:48         ` Or Gerlitz
2014-02-25 22:55           ` Jiri Kosina
2014-03-05 19:46     ` Or Gerlitz
2014-03-06 13:31 ` Or Gerlitz
2014-03-06 13:47   ` Jiri Kosina

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=531F1550.3080401@mellanox.com \
    --to=ogerlitz@mellanox.com \
    --cc=amirv@mellanox.com \
    --cc=davem@davemloft.net \
    --cc=eli@dev.mellanox.co.il \
    --cc=eugenia@mellanox.com \
    --cc=jkosina@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=netdev@vger.kernel.org \
    --cc=roland@kernel.org \
    --cc=saeedm@mellanox.com \
    --cc=sagig@mellanox.com \
    --cc=shlomop@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox