From: Leon Romanovsky <leon@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Patrisious Haddad <phaddad@nvidia.com>,
Christoph Hellwig <hch@lst.de>,
Israel Rukshin <israelr@nvidia.com>,
Linux-nvme <linux-nvme@lists.infradead.org>,
linux-rdma@vger.kernel.org,
Michael Guralnik <michaelgur@nvidia.com>,
Maor Gottlieb <maorg@nvidia.com>,
Max Gurtovoy <mgurtovoy@nvidia.com>
Subject: Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
Date: Wed, 7 Sep 2022 15:51:16 +0300
Message-ID: <YxiTxJvDWPaB9iMf@unreal>
In-Reply-To: <facc31c4-955e-c82e-191b-150313e73f6a@grimberg.me>
On Wed, Sep 07, 2022 at 03:34:21PM +0300, Sagi Grimberg wrote:
>
> > From: Israel Rukshin <israelr@nvidia.com>
> >
> > Add debug prints for fatal QP events that are helpful for finding the
> > root cause of the errors. The ib_get_qp_err_syndrome is called at
> > a work queue since the QP event callback is running on an
> > interrupt context that can't sleep.
> >
> > Signed-off-by: Israel Rukshin <israelr@nvidia.com>
> > Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
>
> What makes nvme-rdma special here? Why do you get this in
> nvme-rdma and not srp/iser/nfs-rdma/rds/smc/ipoib etc?
>
> This entire code needs to move to the rdma core instead
> of being leaked to ulps.
We can move it, but you will lose the connection between the queue number,
the caller and the error itself.
As I answered Christoph, we will need to execute the query QP command
in a workqueue, outside of the event handler.
So you will get a print about the queue being in an error state, and only
later will you see the parsed error print somewhere in dmesg.
Thanks
Thread overview: 16+ messages
2022-09-07 11:37 [PATCH rdma-next 0/4] Provide more error details when a QP moves to Patrisious Haddad
2022-09-07 11:37 ` [PATCH rdma-next 1/4] net/mlx5: Introduce CQE error syndrome Patrisious Haddad
2022-09-07 11:37 ` [PATCH rdma-next 2/4] RDMA/core: Introduce ib_get_qp_err_syndrome function Patrisious Haddad
2022-09-07 11:37 ` [PATCH rdma-next 3/4] RDMA/mlx5: Implement ib_get_qp_err_syndrome Patrisious Haddad
2022-09-07 11:38 ` [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state Patrisious Haddad
2022-09-07 12:02 ` Christoph Hellwig
2022-09-07 12:11 ` Leon Romanovsky
2022-09-07 12:34 ` Sagi Grimberg
2022-09-07 12:51 ` Leon Romanovsky [this message]
2022-09-07 15:16 ` Sagi Grimberg
2022-09-07 15:18 ` Christoph Hellwig
2022-09-07 17:39 ` Leon Romanovsky
2022-11-01 9:12 ` Mark Zhang
2022-11-02 1:56 ` Mark Zhang
2022-09-08 7:55 ` Patrisious Haddad
2022-09-07 17:29 ` Leon Romanovsky