From: Or Gerlitz <ogerlitz-smomgflXvOZWk0Htik3J/w@public.gmane.org>
To: Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>,
Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Ralph Campbell
<ralph.campbell-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>,
Linux-RDMA <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Work completions generated after a queue pair has made the transition to an error state
Date: Wed, 13 Oct 2010 15:51:10 +0200
Message-ID: <4CB5B94E.4080802@voltaire.com>
In-Reply-To: <20101012202221.GD1617@mtldesk30>
Eli Cohen wrote:
> Completions with non-zero (error) status and a wr_id / opcode
> combination were received that were never queued by the application.
> In case of error the opcode of the completed operation is not provided. I am not sure why.
Eli, nothing in the IB spec mandates that the WC.opcode of an
unsuccessful work request be valid. The only WC fields that must be
valid are the work-request ID (cookie) and the status code. I believe
hardware vendors also make sure the vendor ID is valid...
Bart, reading your initial posting, I was under the impression that the
wr_id was something your app didn't post, so in that respect I take back
my response. Of course, when you program to IB you can't assume
anything about the WC.opcode of an errored WR.
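To illustrate the point, here is a minimal, self-contained sketch of a
completion handler that relies only on wr_id and status when a work
request fails. The types below (struct demo_wc, struct demo_posted_wr)
and the helpers demo_track_post() and demo_handle_wc() are hypothetical
stand-ins invented for this example. They are not the kernel's
struct ib_wc or any verbs API. The application is assumed to use the
wr_id as an index into its own table of posted requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the completion fields discussed above. */
enum demo_wc_status { DEMO_WC_SUCCESS, DEMO_WC_FLUSH_ERR };
enum demo_wc_opcode { DEMO_OP_SEND, DEMO_OP_RDMA_WRITE, DEMO_OP_RDMA_READ };

struct demo_wc {
	uint64_t wr_id;              /* valid on success and on error */
	enum demo_wc_status status;  /* valid on success and on error */
	enum demo_wc_opcode opcode;  /* not trusted unless status is success */
};

/* What the application remembers about each request it posted. */
struct demo_posted_wr {
	int in_use;
	enum demo_wc_opcode opcode;
};

#define DEMO_MAX_WR 16
static struct demo_posted_wr demo_posted[DEMO_MAX_WR];

/* Record a posted work request; the wr_id doubles as the table index. */
int demo_track_post(uint64_t wr_id, enum demo_wc_opcode opcode)
{
	if (wr_id >= DEMO_MAX_WR || demo_posted[wr_id].in_use)
		return -1;
	demo_posted[wr_id].in_use = 1;
	demo_posted[wr_id].opcode = opcode;
	return 0;
}

/*
 * Resolve a completion to the opcode the application posted.
 * Returns -1 if the wr_id was never posted or has already completed.
 * The opcode always comes from the application's own record, so an
 * error completion with an undefined wc->opcode is handled correctly.
 */
int demo_handle_wc(const struct demo_wc *wc)
{
	assert(wc != NULL);
	if (wc->wr_id >= DEMO_MAX_WR || !demo_posted[wc->wr_id].in_use)
		return -1;
	demo_posted[wc->wr_id].in_use = 0;
	return (int)demo_posted[wc->wr_id].opcode;
}
```

With this kind of bookkeeping, a completion whose wr_id was never
posted shows up as a -1 return, which is one way to detect the
situation Bart describes.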
Or.
>
>>>> Note: some work requests were queued with and some without the flag
>>>> IB_SEND_SIGNALED. I'm not sure however whether that has anything to do
>>>> with the observed behavior.
> If you have WRs for which you did not set IB_SEND_SIGNALED, they are
> not considered completed before a completion entry is pushed to the CQ
> that corresponds to that send queue. I am not sure if it means that all
> the WRs in the send queue should be completed with error.
>>>> This behavior is easy to reproduce. If I interpret the InfiniBand
>>>> Architecture Specification correctly, this behavior is non-compliant.
>>>>
>>>> Has anyone been looking into this before ?
>>> I haven't seen it. It isn't supposed to happen.
>>>
>>> What hardware and software are you using and how do you
>>> reproduce it?
>> Hello Ralph and Or,
>>
>> The way I reproduce that behavior is by modifying the state of a queue
>> pair into IB_QPS_ERR while RDMA is ongoing. The application, which is
>> multithreaded, performs RDMA by calling ib_post_recv() and
>> ib_post_send() (opcodes IB_WR_SEND, IB_WR_RDMA_READ and
>> IB_WR_RDMA_WRITE). This has been observed with the mlx4 driver, a
>> ConnectX HCA and firmware version 2.7.0.
>>
>> Bart.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html