From: Cheng Xu <chengyou@linux.alibaba.com>
To: jgg@ziepe.ca, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, KaiShen@linux.alibaba.com
Subject: [PATCH for-next 0/3] RDMA/erdma: Support flushing all WRs after QP state changed to ERROR
Date: Wed, 16 Nov 2022 10:31:04 +0800
Message-ID: <20221116023107.82835-1-chengyou@linux.alibaba.com>
Hi,
This series adds support for flushing all WRs posted to hardware after
the QP state has changed to ERROR.
Old firmware may not flush WRs that are posted after the QP state has
changed to ERROR, because it is difficult for firmware to get the
real-time PI (producer index) of QPs, especially for the RQs.
Previously we tried to avoid this issue by implementing custom
drain_{sq/rq} [1], but that approach has a flaw, as Tom and Jason pointed
out, which we have also hit in some scenarios, for example NoF fatal
recovery.
So we introduce a new mechanism to fix this. When registering the ibdev,
we create a workqueue for reflushing (we name it "reflush" because
hardware has already started flushing the QPs by that time, and the work
lets hardware flush newly posted WRs). Once a QP needs to flush WRs, or
new WRs are posted after flushing has started, we queue a delayed work to
the workqueue, or modify its delay if it is already queued. In the work,
the driver notifies firmware of the latest PIs via the CMDQ, so that
firmware can flush all the newly posted WRs. This applies to kernel QPs
first.
- #1 adds a workqueue for WRs reflushing.
- #2 adds a reflushing work for each QP.
- #3 notifies the latest PIs to firmware for reflushing.
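The three steps above can be modeled in user space as follows. This is
only a sketch and not the driver code: every name in it (qp_model,
fw_model, qp_kick_reflush, REFLUSH_DELAY_TICKS, and so on) is
hypothetical, the "delayed work" is reduced to a pending flag plus a
deadline, and the CMDQ notification is reduced to copying the PIs into a
struct. The queue-or-modify step is meant to behave like
mod_delayed_work() from the kernel workqueue API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; no name here comes from the erdma driver. */

#define REFLUSH_DELAY_TICKS 5 /* assumed delay, chosen only for the example */

enum qp_state { QP_RTS, QP_ERROR };

/* What the modeled firmware knows about the QP. */
struct fw_model {
	uint32_t sq_pi;
	uint32_t rq_pi;
	unsigned int notify_count; /* number of PI notifications received */
};

struct qp_model {
	enum qp_state state;
	uint32_t sq_pi;        /* driver-side producer indexes */
	uint32_t rq_pi;
	bool reflush_pending;  /* stands in for a queued delayed work */
	uint64_t reflush_due;  /* tick at which the work should run */
	struct fw_model *fw;
};

/* Queue the reflush work, or push its deadline back if already queued. */
static void qp_kick_reflush(struct qp_model *qp, uint64_t now)
{
	qp->reflush_pending = true;
	qp->reflush_due = now + REFLUSH_DELAY_TICKS;
}

/* The QP enters ERROR: flushing starts, so a reflush is scheduled. */
static void qp_modify_to_error(struct qp_model *qp, uint64_t now)
{
	qp->state = QP_ERROR;
	qp_kick_reflush(qp, now);
}

/* Posting advances the PI; in ERROR state it also (re)arms the reflush. */
static void qp_post_send(struct qp_model *qp, uint64_t now)
{
	qp->sq_pi++;
	if (qp->state == QP_ERROR)
		qp_kick_reflush(qp, now);
}

static void qp_post_recv(struct qp_model *qp, uint64_t now)
{
	qp->rq_pi++;
	if (qp->state == QP_ERROR)
		qp_kick_reflush(qp, now);
}

/* Work handler: report the latest PIs to the modeled firmware. */
static void qp_reflush_work(struct qp_model *qp)
{
	assert(qp->state == QP_ERROR); /* reflush only makes sense in ERROR */
	qp->fw->sq_pi = qp->sq_pi;
	qp->fw->rq_pi = qp->rq_pi;
	qp->fw->notify_count++;
}

/* Run the pending work if its deadline has passed; returns true if it ran. */
static bool qp_tick(struct qp_model *qp, uint64_t now)
{
	if (!qp->reflush_pending || now < qp->reflush_due)
		return false;
	qp->reflush_pending = false;
	qp_reflush_work(qp);
	return true;
}
```

In this model, WRs posted shortly after one another in ERROR state only
push the deadline back, so a burst of posts results in a single
notification carrying the final PIs rather than one notification per WR.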
[1] https://lore.kernel.org/all/20220824094251.23190-3-chengyou@linux.alibaba.com/t/
Thanks,
Cheng Xu
Cheng Xu (3):
RDMA/erdma: Add a workqueue for WRs reflushing
RDMA/erdma: Implement the lifecycle of reflushing work for each QP
RDMA/erdma: Notify the latest PI to FW for reflushing when necessary
drivers/infiniband/hw/erdma/erdma.h | 1 +
drivers/infiniband/hw/erdma/erdma_hw.h | 8 ++++++
drivers/infiniband/hw/erdma/erdma_main.c | 14 +++++++++--
drivers/infiniband/hw/erdma/erdma_qp.c | 30 ++++++++++++++++-------
drivers/infiniband/hw/erdma/erdma_verbs.c | 18 ++++++++++++++
drivers/infiniband/hw/erdma/erdma_verbs.h | 7 ++++++
6 files changed, 67 insertions(+), 11 deletions(-)
--
2.27.0
Thread overview: 5+ messages
2022-11-16 2:31 Cheng Xu [this message]
2022-11-16 2:31 ` [PATCH for-next 1/3] RDMA/erdma: Add a workqueue for WRs reflushing Cheng Xu
2022-11-16 2:31 ` [PATCH for-next 2/3] RDMA/erdma: Implement the lifecycle of reflushing work for each QP Cheng Xu
2022-11-16 2:31 ` [PATCH for-next 3/3] RDMA/erdma: Notify the latest PI to FW for reflushing when necessary Cheng Xu
2022-11-24 19:00 ` [PATCH for-next 0/3] RDMA/erdma: Support flushing all WRs after QP state changed to ERROR Jason Gunthorpe