netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Saeed Mahameed <saeedm@nvidia.com>
Cc: Saeed Mahameed <saeed@kernel.org>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Patrisious Haddad <phaddad@nvidia.com>,
	Jakub Kicinski <kuba@kernel.org>,
	linux-rdma@vger.kernel.org, netdev@vger.kernel.org,
	Paolo Abeni <pabeni@redhat.com>,
	Yishai Hadas <yishaih@nvidia.com>
Subject: Re: [PATCH mlx5-next 1/3] net/mlx5: Nullify eq->dbg and qp->dbg pointers post destruction
Date: Sun, 10 Apr 2022 10:58:16 +0300	[thread overview]
Message-ID: <YlKOGMw7vemdPn/a@unreal> (raw)
In-Reply-To: <20220408193035.2uplgfjnfjo4s4f2@sx1>

On Fri, Apr 08, 2022 at 12:30:35PM -0700, Saeed Mahameed wrote:
> On 06 Apr 10:55, Leon Romanovsky wrote:
> > On Tue, Apr 05, 2022 at 12:48:45PM -0700, Saeed Mahameed wrote:
> > > On 05 Apr 11:12, Leon Romanovsky wrote:
> > > > From: Patrisious Haddad <phaddad@nvidia.com>
> > > >
> > > > Prior to this patch in the case that destroy_unmap_eq()
> > > > failed and was called again, it triggered an additional call of
> > > 
> > > Where is it being failed and called again ? this shouldn't even be an
> > > option, we try to keep mlx5 symmetrical, constructors and destructors are
> > > supposed to be called only once in their respective positions.
> > > the callers must be fixed to avoid re-entry, or change destructors to clear
> > > up all resources even on failures, no matter what do not invent a reentry
> > > protocols to mlx5 destructors.
> > 
> > It can happen when QP is exposed through DEVX interface. In that flow,
> > only FW knows about it and reference count all users. This means that
> > attempt to destroy such QP will fail, but mlx5_core is structured in
> > such way that all cleanup was done before calling to FW to get
> > success/fail response.
> 
> I wasn't talking about destroy_qp, actually destroy_qp is implemented the
> way i am asking you to implement destroy_eq(); remove debugfs on first call
> to destroy EQ, and drop the reentry logic from from mlx5_eq_destroy_generic
> and destroy_async_eq.
> 
> EQ is a core/mlx5_ib resources, it's not exposed to user nor DEVX, it
> shouldn't be subject to DEVX limitations.

I tend to agree with you. I'll take another look on it and resubmit.

> 
> Also looking at the destroy_qp implementation, it removes the debugfs
> unconditionally even if the QP has ref count and removal will fail in FW.
> just FYI.

Right, we don't care about debugfs.

> 
> For EQ I don't even understand why devx can cause ODP EQ removal to fail..
> you must fix this at mlx5_ib layer, but for this patch, please drop the
> re-entry and remove debugfs in destroy_eq, unconditionally.

The reason to complexity is not debugfs, but an existence of
"mlx5_frag_buf_free(dev, &eq->frag_buf);" line, after FW command is
executed.

We need to separate to two flows: the one that can tolerate FW cmd failures
and the one that can't. If you don't add "reentry" flag, you can (theoretically)
find yourself leaking ->frag_buf in the flows that don't know how to reentry.

I'll resubmit.

Thanks

  reply	other threads:[~2022-04-10  7:58 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-05  8:12 [PATCH rdma-next 0/3] Handle FW failures to destroy QP/RQ objects Leon Romanovsky
2022-04-05  8:12 ` [PATCH mlx5-next 1/3] net/mlx5: Nullify eq->dbg and qp->dbg pointers post destruction Leon Romanovsky
2022-04-05 19:48   ` Saeed Mahameed
2022-04-06  7:55     ` Leon Romanovsky
2022-04-08 19:30       ` Saeed Mahameed
2022-04-10  7:58         ` Leon Romanovsky [this message]
2022-04-05  8:12 ` [PATCH rdma-next 2/3] RDMA/mlx5: Handling dct common resource destruction upon firmware failure Leon Romanovsky
2022-04-05  8:12 ` [PATCH rdma-next 3/3] RDMA/mlx5: Return the firmware result upon destroying QP/RQ Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YlKOGMw7vemdPn/a@unreal \
    --to=leon@kernel.org \
    --cc=jgg@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=phaddad@nvidia.com \
    --cc=saeed@kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).