Linux RDMA and InfiniBand development
From: Leon Romanovsky <leon@kernel.org>
To: "Saleem, Shiraz" <shiraz.saleem@intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
	Gal Pressman <galpress@amazon.com>,
	Doug Ledford <dledford@redhat.com>,
	Adit Ranadive <aditr@vmware.com>,
	Ariel Elior <aelior@marvell.com>,
	Bernard Metzler <bmt@zurich.ibm.com>,
	Christian Benvenuti <benve@cisco.com>,
	"Dalessandro, Dennis" <dennis.dalessandro@intel.com>,
	Devesh Sharma <devesh.sharma@broadcom.com>,
	"Latif, Faisal" <faisal.latif@intel.com>,
	Lijun Ou <oulijun@huawei.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	Michal Kalderon <mkalderon@marvell.com>,
	"Marciniszyn, Mike" <mike.marciniszyn@intel.com>,
	Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>,
	Nelson Escobar <neescoba@cisco.com>,
	Parvi Kaustubhi <pkaustub@cisco.com>,
	Potnuri Bharat Teja <bharat@chelsio.com>,
	Selvin Xavier <selvin.xavier@broadcom.com>,
	Somnath Kotur <somnath.kotur@broadcom.com>,
	Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>,
	VMware PV-Drivers <pv-drivers@vmware.com>,
	Weihang Li <liweihang@huawei.com>,
	"Wei Hu(Xavier)" <huwei87@hisilicon.com>,
	Yishai Hadas <yishaih@nvidia.com>,
	Zhu Yanjun <yanjunz@nvidia.com>
Subject: Re: [PATCH rdma-next 01/10] RDMA: Restore ability to fail on PD deallocate
Date: Thu, 27 Aug 2020 09:56:43 +0300
Message-ID: <20200827065643.GP1362631@unreal>
In-Reply-To: <9DD61F30A802C4429A01CA4200E302A7010712C8EC@ORSMSX101.amr.corp.intel.com>

On Thu, Aug 27, 2020 at 02:06:03AM +0000, Saleem, Shiraz wrote:
> > Subject: Re: [PATCH rdma-next 01/10] RDMA: Restore ability to fail on PD
> > deallocate
> >
> > On Wed, Aug 26, 2020 at 12:49:03AM +0000, Saleem, Shiraz wrote:
> >
> > > The API is quite confusing now. If drivers are not expected to fail
> > > the destroy and there is no way to propagate the device failures, then
> > > the return type should be a void.
> >
> > More or less, drivers can only return -EAGAIN with the idea that a future call during
> > the close process will eventually succeed.
> >
> > Any permanent failure will trigger WARN_ON and a memory leak
> >
> > Maybe we should switch the return code to bool or something to be a little clearer
> > that it is a request to retry, not a failure?
> >
> There is no retry for kernel object destroy, right? So not sure bool is making it clearer.

Right, kernel verbs users don't know how to deal with a destroy failure
and are designed to ensure that destroy always succeeds.

>
> I am not very familiar with devx flows but doesn’t it bypass the ib verbs layer altogether?
> i.e. mlx5_ib_dealloc_pd isn’t directly called in the devx flows, no? So why change its return
> type and other provider destroy callbacks?

DevX itself indeed bypasses ib_core and, as a standalone feature, doesn't
need any changes in the destroy paths. The problem arises when an ib_core
object is created with ibv_create_XXX() and later forwarded to a DevX context.

FW counts DevX accesses and elevates internal reference counters to
ensure that the user gets a proper error if they try to destroy in-use
resources.

This error is returned to mlx5_ib_dealloc_pd() too if DevX is not
cleaned up. The user can issue this call at any time, for example after
deciding to skip DevX cleanup, and ib_core/mlx5_ib can't prevent the call
to mlx5_ib_dealloc_pd() at this stage.
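
To illustrate the ordering problem, here is a minimal user-space sketch,
not driver code. Every name in it (fake_fw_pd, fake_devx_use_pd,
fake_devx_release_pd, fake_fw_dealloc_pd) is hypothetical, and the choice
of -EBUSY for "still in use" is an assumption made for the example, not
the value FW actually reports.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a firmware-side PD; these names do not exist in mlx5. */
struct fake_fw_pd {
	bool allocated;          /* object still exists in the (modelled) firmware */
	unsigned int devx_refs;  /* references taken through a DevX context */
};

/* A DevX context starts using the PD: the reference counter is elevated. */
static void fake_devx_use_pd(struct fake_fw_pd *pd)
{
	pd->devx_refs++;
}

/* DevX cleanup drops one reference; calling it without a reference is a bug. */
static void fake_devx_release_pd(struct fake_fw_pd *pd)
{
	assert(pd->devx_refs > 0);
	pd->devx_refs--;
}

/*
 * Modelled dealloc: refuses to destroy an in-use object.
 * -EBUSY is an assumed error code chosen for this sketch.
 */
static int fake_fw_dealloc_pd(struct fake_fw_pd *pd)
{
	if (!pd->allocated)
		return -EINVAL;
	if (pd->devx_refs)
		return -EBUSY;
	pd->allocated = false;
	return 0;
}
```

In this model, a dealloc issued before the DevX release fails and leaves
the object allocated; the same call succeeds once the reference is dropped.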

The difference between the mlx5 device and other providers is that HW/FW
guarantees full cleanup during file close.

>
> But let's go down the path that we really need a return code in the destroy APIs to solve this problem.
> For one, I don’t see how we can say it's meant exclusively for devx drivers to use for a fail.
> Also, can we really claim the API contract is that a driver can fail a destroy given that a future destroy will succeed?
> Since the kernel destroy has no retry.

This is why we have special calls for kernel users with WARN_ON() and
forced cleanup of ib_core resources.

>
> Which then boils down to: do we just keep a simpler definition of the API contract -- driver can just return whatever the true error code is?
> i.e. if it wants a retry, use EAGAIN. If it has a non-recoverable device error, then reset the device, clean up the resources but return ENOTRECOVERABLE.
> ib_core can enable the retry logic for EAGAIN _only_. For other error codes, ib_core can trigger a WARN_ON or something to indicate permanent failure.

We can, but drivers would have to implement this EAGAIN/ENOTRECOVERABLE
logic; this is why, in the initial phase, we always return success.
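
For reference, the contract proposed above could look roughly like the
user-space sketch below. It is only an illustration: fake_pd,
fake_dealloc_pd and fake_core_destroy are hypothetical names, the retry
count is arbitrary, and the "one reference dropped per attempt" step only
simulates other cleanup making progress. It is not how ib_core behaves
today.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a provider object; not a real kernel structure. */
struct fake_pd {
	int busy_refs;  /* references still held elsewhere */
	int hw_broken;  /* non-zero simulates a non-recoverable device error */
};

/*
 * Hypothetical driver callback following the proposed contract:
 * 0 on success, -EAGAIN to request a retry, -ENOTRECOVERABLE on
 * permanent failure.
 */
static int fake_dealloc_pd(struct fake_pd *pd)
{
	if (pd->hw_broken)
		return -ENOTRECOVERABLE;
	if (pd->busy_refs > 0)
		return -EAGAIN;
	return 0;
}

/*
 * Hypothetical core-side policy: retry on -EAGAIN only, warn on anything
 * that is still an error when the loop ends.
 */
static int fake_core_destroy(struct fake_pd *pd, int max_tries, int *warned)
{
	int ret = -EAGAIN;
	int i;

	assert(max_tries > 0);
	*warned = 0;

	for (i = 0; i < max_tries; i++) {
		ret = fake_dealloc_pd(pd);
		if (ret != -EAGAIN)
			break;
		/* Simulate other cleanup dropping one reference between attempts. */
		if (pd->busy_refs > 0)
			pd->busy_refs--;
	}

	if (ret) {
		*warned = 1;  /* stands in for WARN_ON() in this sketch */
		fprintf(stderr, "WARN: destroy did not complete (%d)\n", ret);
	}
	return ret;
}
```

With two outstanding references and five allowed attempts, the sketch
succeeds without warning; with hw_broken set, it stops on the first
attempt and warns.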

> It can also pass on ret_code to user-space as its doing today?
>
> Shiraz


Thread overview: 33+ messages
2020-08-24 10:32 [PATCH rdma-next 00/10] Restore failure of destroy commands Leon Romanovsky
2020-08-24 10:32 ` [PATCH rdma-next 01/10] RDMA: Restore ability to fail on PD deallocate Leon Romanovsky
2020-08-25  8:13   ` Gal Pressman
2020-08-25  8:38     ` Leon Romanovsky
2020-08-25 11:52     ` Jason Gunthorpe
2020-08-25 12:12       ` Gal Pressman
2020-08-25 12:34         ` Leon Romanovsky
2020-08-25 13:07         ` Jason Gunthorpe
2020-08-25 13:32           ` Gal Pressman
2020-08-25 13:44             ` Jason Gunthorpe
2020-08-25 13:50               ` Jason Gunthorpe
2020-08-25 14:04               ` Gal Pressman
2020-08-25 14:32                 ` Jason Gunthorpe
2020-08-26  0:49                 ` Saleem, Shiraz
2020-08-26  6:34                   ` Leon Romanovsky
2020-08-26 11:40                   ` Jason Gunthorpe
2020-08-27  2:06                     ` Saleem, Shiraz
2020-08-27  6:56                       ` Leon Romanovsky [this message]
2020-08-27 23:30                         ` Saleem, Shiraz
2020-08-27 12:13                       ` Jason Gunthorpe
2020-08-27 23:29                         ` Saleem, Shiraz
2020-08-28 11:25                           ` Jason Gunthorpe
2020-08-24 10:32 ` [PATCH rdma-next 02/10] RDMA: Restore ability to fail on AH destroy Leon Romanovsky
2020-08-25  8:13   ` Gal Pressman
2020-08-25  8:32     ` Leon Romanovsky
2020-08-24 10:32 ` [PATCH rdma-next 03/10] RDMA/mlx5: Issue FW command to destroy SRQ on reentry Leon Romanovsky
2020-08-24 10:32 ` [PATCH rdma-next 04/10] RDMA/mlx5: Fix potential race between destroy and CQE poll Leon Romanovsky
2020-08-24 10:32 ` [PATCH rdma-next 05/10] RDMA: Restore ability to fail on SRQ destroy Leon Romanovsky
2020-08-24 10:32 ` [PATCH rdma-next 06/10] RDMA/core: Delete function indirection for alloc/free kernel CQ Leon Romanovsky
2020-08-24 10:32 ` [PATCH rdma-next 07/10] RDMA: Allow fail of destroy CQ Leon Romanovsky
2020-08-24 10:32 ` [PATCH rdma-next 08/10] RDMA: Change XRCD destroy return value Leon Romanovsky
2020-08-24 10:32 ` [PATCH rdma-next 09/10] RDMA: Restore ability to return error for destroy WQ Leon Romanovsky
2020-08-24 10:32 ` [PATCH rdma-next 10/10] RDMA: Make counters destroy symmetrical Leon Romanovsky
