From: swise@opengridcomputing.com (Steve Wise)
Subject: nvme_rdma - leaves provider resources allocated
Date: Wed, 24 Aug 2016 09:09:48 -0500
Message-ID: <006501d1fe11$2c3e6200$84bb2600$@opengridcomputing.com>
In-Reply-To: <98396d58-4a16-0e1f-e42b-912edb8a7cf6@grimberg.me>

> 
> > Assume an nvme_rdma host has one attached controller in RECONNECTING state,
> > and that controller has failed to reconnect at least once and thus is in the
> > delay_schedule time before retrying the connection.  At that moment, there
> > are no cm_ids allocated for that controller because the admin queue and the
> > io queues have been freed.  So nvme_rdma cannot get a DEVICE_REMOVAL from
> > the rdma_cm.  This means if the underlying provider module is removed, it
> > will be removed with resources still allocated by nvme_rdma.  For iw_cxgb4,
> > this causes a BUG_ON() in gen_pool_destroy() because MRs are still allocated
> > for the controller.
> >
> > Thoughts on how to fix this?
> 
> Hey Steve,
> 
> I think it's time to go back to your client register proposal.
> 
> I can't think of any way to get it right at the moment...
> 
> Maybe if we can make it only do something meaningful in remove_one()
> to handle device removal we can get away with it...

Hey Sagi, 

I'm finalizing a WIP series that takes a different approach (we can
certainly reconsider my ib_client patch too).  My WIP adds the concept of an
"unplug" cm_id for each nvme_rdma_ctrl controller.  When the controller is first
created and the admin QP is connected to the target, the unplug_cm_id is created
and address resolution is done on it to bind it to the same device that the
admin QP is bound to.  This unplug_cm_id persists across any/all kato recovery
and thus is always available to receive DEVICE_REMOVAL events.  It also
simplifies the unplug handler because the cm_id isn't associated with any of
the IO queues or the admin queue.
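
As a rough user-space illustration of the unplug cm_id idea (this is not the
WIP patch, and it uses no real rdma_cm calls), the sketch below models a device
that delivers a removal event to every id still bound to it.  All names here
(model_device, model_cm_id, model_ctrl, bind_id, ctrl_create, and so on) are
hypothetical, and the MR count of 4 is an arbitrary placeholder.  It assumes
that the removal handler is the only place where the controller's MRs get
released once the queues are gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_IDS 8

struct model_cm_id;

/* Hypothetical stand-in for an RDMA device that can be removed. */
struct model_device {
    struct model_cm_id *ids[MAX_IDS];
    size_t nr_ids;
};

/* Hypothetical stand-in for a cm_id bound to a device. */
struct model_cm_id {
    struct model_device *dev;
    void (*on_removal)(void *ctx);
    void *ctx;
};

/* Hypothetical stand-in for a controller. */
struct model_ctrl {
    struct model_cm_id queue_id;   /* freed while waiting to reconnect */
    struct model_cm_id unplug_id;  /* kept for the controller's lifetime */
    int mrs_allocated;
};

static bool bind_id(struct model_device *dev, struct model_cm_id *id,
                    void (*cb)(void *), void *ctx)
{
    if (dev->nr_ids == MAX_IDS)
        return false;
    id->dev = dev;
    id->on_removal = cb;
    id->ctx = ctx;
    dev->ids[dev->nr_ids++] = id;
    return true;
}

static void unbind_id(struct model_cm_id *id)
{
    struct model_device *dev = id->dev;

    if (!dev)
        return;
    for (size_t i = 0; i < dev->nr_ids; i++) {
        if (dev->ids[i] == id) {
            /* Swap the last entry into the freed slot. */
            dev->ids[i] = dev->ids[--dev->nr_ids];
            break;
        }
    }
    id->dev = NULL;
}

/* Removal handler: release everything the controller holds on the device. */
static void ctrl_device_removal(void *ctx)
{
    struct model_ctrl *ctrl = ctx;

    ctrl->mrs_allocated = 0;
}

static void ctrl_create(struct model_ctrl *ctrl, struct model_device *dev,
                        bool use_unplug_id)
{
    *ctrl = (struct model_ctrl){0};
    ctrl->mrs_allocated = 4;  /* arbitrary placeholder count */
    bind_id(dev, &ctrl->queue_id, ctrl_device_removal, ctrl);
    if (use_unplug_id)
        bind_id(dev, &ctrl->unplug_id, ctrl_device_removal, ctrl);
}

/* Models the delay between reconnect attempts: the queue ids are gone. */
static void ctrl_enter_reconnect_delay(struct model_ctrl *ctrl)
{
    unbind_id(&ctrl->queue_id);
}

/* Deliver a removal event to every bound id; return how many were notified. */
static int device_remove(struct model_device *dev)
{
    int delivered = 0;

    for (size_t i = 0; i < dev->nr_ids; i++) {
        struct model_cm_id *id = dev->ids[i];

        id->on_removal(id->ctx);
        id->dev = NULL;
        delivered++;
    }
    dev->nr_ids = 0;
    return delivered;
}
```

In this model, a controller created without the extra id and then put into the
reconnect delay gets no callback from device_remove() and keeps its MRs, while
one created with the extra id is notified and releases them.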

I also found another bug: if the reconnect worker times out waiting for RDMA
connection setup on an IO or admin QP, a QP is leaked.  I'm looking into this
as well.

Do you have any thoughts on the issue I posted about controller references
around deletion?

http://lists.infradead.org/pipermail/linux-nvme/2016-August/005919.html

Thanks!

Steve.

Thread overview: 6+ messages
2016-08-23 16:58 nvme_rdma - leaves provider resources allocated Steve Wise
2016-08-24  9:31 ` Sagi Grimberg
2016-08-24 14:09   ` Steve Wise [this message]
2016-08-25 21:52     ` Sagi Grimberg
2016-08-25 22:03       ` Steve Wise
2016-08-25 22:06         ` Sagi Grimberg