From: Leon Romanovsky <leon@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Patrisious Haddad <phaddad@nvidia.com>,
Christoph Hellwig <hch@lst.de>,
Israel Rukshin <israelr@nvidia.com>,
Linux-nvme <linux-nvme@lists.infradead.org>,
linux-rdma@vger.kernel.org,
Michael Guralnik <michaelgur@nvidia.com>,
Maor Gottlieb <maorg@nvidia.com>,
Max Gurtovoy <mgurtovoy@nvidia.com>
Subject: Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
Date: Wed, 7 Sep 2022 20:29:57 +0300
Message-ID: <YxjVFZYMLh19vwNR@unreal>
In-Reply-To: <ac268c86-c013-5cc5-5e1c-71ee90111d8f@grimberg.me>
On Wed, Sep 07, 2022 at 06:16:05PM +0300, Sagi Grimberg wrote:
>
> > > > From: Israel Rukshin <israelr@nvidia.com>
> > > >
> > > > Add debug prints for fatal QP events that are helpful for finding the
> > > > root cause of the errors. The ib_get_qp_err_syndrome is called at
> > > > a work queue since the QP event callback is running on an
> > > > interrupt context that can't sleep.
> > > >
> > > > Signed-off-by: Israel Rukshin <israelr@nvidia.com>
> > > > Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > > Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> > >
> > > What makes nvme-rdma special here? Why do you get this in
> > > nvme-rdma and not srp/iser/nfs-rdma/rds/smc/ipoib etc?
> > >
> > > This entire code needs to move to the rdma core instead
> > > of being leaked to ulps.
> >
> > We can move, but you will lose connection between queue number,
> > caller and error itself.
>
> That still doesn't explain why nvme-rdma is special.
It was important for us to get a proper review from at least one ULP;
nvme-rdma is not special at all.
>
> In any event, the ulp can log the qpn so the context can be interrogated
> if that is important.
ok
Thread overview: 16+ messages
2022-09-07 11:37 [PATCH rdma-next 0/4] Provide more error details when a QP moves to Patrisious Haddad
2022-09-07 11:37 ` [PATCH rdma-next 1/4] net/mlx5: Introduce CQE error syndrome Patrisious Haddad
2022-09-07 11:37 ` [PATCH rdma-next 2/4] RDMA/core: Introduce ib_get_qp_err_syndrome function Patrisious Haddad
2022-09-07 11:37 ` [PATCH rdma-next 3/4] RDMA/mlx5: Implement ib_get_qp_err_syndrome Patrisious Haddad
2022-09-07 11:38 ` [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state Patrisious Haddad
2022-09-07 12:02 ` Christoph Hellwig
2022-09-07 12:11 ` Leon Romanovsky
2022-09-07 12:34 ` Sagi Grimberg
2022-09-07 12:51 ` Leon Romanovsky
2022-09-07 15:16 ` Sagi Grimberg
2022-09-07 15:18 ` Christoph Hellwig
2022-09-07 17:39 ` Leon Romanovsky
2022-11-01 9:12 ` Mark Zhang
2022-11-02 1:56 ` Mark Zhang
2022-09-08 7:55 ` Patrisious Haddad
2022-09-07 17:29 ` Leon Romanovsky [this message]