Linux RDMA and InfiniBand development
 help / color / mirror / Atom feed
From: Bernard Metzler <bernard.metzler@linux.dev>
To: Stefan Metzmacher <metze@samba.org>, jgg@ziepe.ca, leon@kernel.org
Cc: linux-rdma@vger.kernel.org
Subject: Re: [PATCH] RDMA/siw: Always report immediate post SQ errors
Date: Tue, 23 Sep 2025 18:08:52 +0200	[thread overview]
Message-ID: <18c3ae79-a9f9-4e41-8ecc-c7748ea5157b@linux.dev> (raw)
In-Reply-To: <502b6c51-1585-4496-8984-667cef5c3848@samba.org>

On 23.09.2025 17:44, Stefan Metzmacher wrote:
> Am 23.09.25 um 16:45 schrieb bernard.metzler@linux.dev:
>> From: Bernard Metzler <bernard.metzler@linux.dev>
>>
>> In siw_post_send(), any immediate error encountered during
>> processing of the work request list must be reported to the
>> caller, even if previous work requests in that list were just
>> accepted and added to the send queue.
>> Not reporting those errors confuses the caller, which would
>> wait indefinitely for the failing and potentially subsequently
>> aborted work requests completion.
>> This fixes a case where immediate errors were overwritten
>> by subsequent code in siw_post_send().
>>
>> Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
>> Suggested-by: Stefan Metzmacher <metze@samba.org>
>> Signed-off-by: Bernard Metzler <bernard.metzler@linux.dev>
>> ---
>>   drivers/infiniband/sw/siw/siw_verbs.c | 25 ++++++++++++++-----------
>>   1 file changed, 14 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
>> index 35c3bde0d00a..efa2f097b582 100644
>> --- a/drivers/infiniband/sw/siw/siw_verbs.c
>> +++ b/drivers/infiniband/sw/siw/siw_verbs.c
>> @@ -769,7 +769,7 @@ int siw_post_send(struct ib_qp *base_qp, const struct ib_send_wr *wr,
>>       struct siw_wqe *wqe = tx_wqe(qp);
>>       unsigned long flags;
>> -    int rv = 0;
>> +    int rv = 0, imm_err = 0;
>>       if (wr && !rdma_is_kernel_res(&qp->base_qp.res)) {
>>           siw_dbg_qp(qp, "wr must be empty for user mapped sq\n");
>> @@ -955,9 +955,17 @@ int siw_post_send(struct ib_qp *base_qp, const struct ib_send_wr *wr,
>>        * Send directly if SQ processing is not in progress.
>>        * Eventual immediate errors (rv < 0) do not affect the involved
>>        * RI resources (Verbs, 8.3.1) and thus do not prevent from SQ
>> -     * processing, if new work is already pending. But rv must be passed
>> -     * to caller.
>> +     * processing, if new work is already pending. But rv and pointer
>> +     * to failed work request must be passed to caller.
>>        */
>> +    if (unlikely(rv < 0)) {
>> +        /*
>> +         * Immediate error
>> +         */
>> +        siw_dbg_qp(qp, "Immediate error %d\n", rv);
>> +        imm_err = rv;
>> +        *bad_wr = wr;
>> +    }
>>       if (wqe->wr_status != SIW_WR_IDLE) {
>>           spin_unlock_irqrestore(&qp->sq_lock, flags);
>>           goto skip_direct_sending;
>> @@ -982,15 +990,10 @@ int siw_post_send(struct ib_qp *base_qp, const struct ib_send_wr *wr,
>>       up_read(&qp->state_lock);
>> -    if (rv >= 0)
>> -        return 0;
>> -    /*
>> -     * Immediate error
>> -     */
>> -    siw_dbg_qp(qp, "error %d\n", rv);
>> +    if (unlikely(imm_err))
>> +        return imm_err;
>> -    *bad_wr = wr;
>> -    return rv;
>> +    return (rv >= 0) ? 0 : rv;
>>   }
> 
> 
> So the caller needs to set *bad_rw = NULL itself
> in order to detect if an immediate error happened or not...
> 
> I'm also fine with that, I just hope it's consistent across
> all drivers...
> 
Yes, that's what all providers do, if populating the devices
post_send() method.

Looking at the in-kernel interface, ib_post_send() even
checks if there is a valid bad_wr ptr at all, providing
a dummy pointer otherwise.
That, since all (!) ULP's in infiniband/ulp provide a
NULL bad_wr. They just do not care too much about unwinding
errors to a certain WR, or just send single entry WR
lists.

Thank you!
Bernard.


> Thanks!
> metze


  reply	other threads:[~2025-09-23 16:08 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-23 14:45 [PATCH] RDMA/siw: Always report immediate post SQ errors bernard.metzler
2025-09-23 15:44 ` Stefan Metzmacher
2025-09-23 16:08   ` Bernard Metzler [this message]
2025-09-26 16:15 ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=18c3ae79-a9f9-4e41-8ecc-c7748ea5157b@linux.dev \
    --to=bernard.metzler@linux.dev \
    --cc=jgg@ziepe.ca \
    --cc=leon@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=metze@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox