Linux NFS development
 help / color / mirror / Atom feed
From: Dan Aloni <dan.aloni@vastdata.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: Re: [PATCH] rpcrdma: don't decref EP if a ESTABLISHED did not happen
Date: Sun, 5 May 2024 23:10:35 +0300	[thread overview]
Message-ID: <20240505201035.eya22s7spmejnpdf@gmail.com> (raw)
In-Reply-To: <ZjfXOiQxyGNZpHOj@tissot.1015granger.net>

On 2024-05-05 15:00:10, Chuck Lever wrote:
> On Sun, May 05, 2024 at 09:36:28PM +0300, Dan Aloni wrote:
> > On 2024-05-05 10:35:46, Chuck Lever wrote:
> > [..]
> > > Second, I wonder if, when bonding interfaces, there's an opportunity
> > > for the verbs consumer to take an additional transport reference.
> > > Cc'ing linux-rdma for input on that issue. Can you show a brief
> > > diagram of when the ep reference count changes when bonding?
> > 
> > Not sure we need an additional reference here.
> 
> We already have two mechanisms in play:
> 
> - the ep reference count
> - the re_connect_status value
> 
> I would prefer not to add a boolean here. Seems like
> re_connect_status could do that job if we need it to.

That's possible. So before overrides e.g. `ep->re_connect_status =
-ENODEV` we would need to remember whether it was `1` and across the
call to `rpcrdma_force_disconnect()`, to do the putref() after it.
Anyway, the patch will be a big larger I guess.

>[..]
> > Testing 6.8.9 with both the patch and `wake_up_all(&ep->re_connect_wait);`
> > for `RDMA_CM_EVENT_DEVICE_REMOVAL` works for me, showing proper recovery
> > on bonding, I'll post in a reply.
> 
> It looks like you are trying to fix several issues in one patch. So
> I need you to separate these issues and the fixes, and let's focus
> on the upstream kernel (v6.9-rc6) because there's nothing I can do
> about the RHEL9 kernel, which is clearly a different source base
> than the one I work on.
> 
> If we need a "wake_up_all(&ep->re_connect_wait);" during
> DEVICE_REMOVAL, that should be a separate patch. And you need to
> figure out if ADDR_CHANGE needs the same treatment: the v2 you just
> sent changes the behavior of ADDR_CHANGE too but does not mention
> whether it intended that change.

Was it just the `rpcrdma_ep_put` change in this case? Will feel less
comfortable changing it without a repro of the `ADDR_CHANGE` case. Going
to isolate it.

> Without a reproducer or a detailed explanation of how the ep
> reference count changes in step with CM events, I can't properly
> assess your proposed fix.

With 6.8 I only see a single `RDMA_CM_EVENT_DEVICE_REMOVAL` event
arriving, driving a single `rpcrdma_ep_put` call to put the reference
back at 1, and it waits forever. Getting late here, I'll check on 6.9-rc
tomorrow.

Maybe for a repro, you can generate a single interface bond? I haven't
tried, but possibly it would occur on a single one too if the underlying
port is down.

-- 
Dan Aloni

  reply	other threads:[~2024-05-05 20:10 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-05 12:49 [PATCH] rpcrdma: don't decref EP if a ESTABLISHED did not happen Dan Aloni
2024-05-05 14:35 ` Chuck Lever
2024-05-05 18:36   ` Dan Aloni
2024-05-05 18:38     ` [PATCH] rpcrdma: decref EP only if ESTABLISHED and handle DEVICE_REMOVAL Dan Aloni
2024-05-05 19:00     ` [PATCH] rpcrdma: don't decref EP if a ESTABLISHED did not happen Chuck Lever
2024-05-05 20:10       ` Dan Aloni [this message]
2024-05-12 17:51 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240505201035.eya22s7spmejnpdf@gmail.com \
    --to=dan.aloni@vastdata.com \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox