From: Or Gerlitz <ogerlitz-smomgflXvOZWk0Htik3J/w@public.gmane.org>
To: Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>,
Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Ralph Campbell
<ralph.campbell-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>,
Linux-RDMA <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Work completions generated after a queue pair has made the transition to an error state
Date: Wed, 13 Oct 2010 15:51:10 +0200
Message-ID: <4CB5B94E.4080802@voltaire.com>
In-Reply-To: <20101012202221.GD1617@mtldesk30>
Eli Cohen wrote:
> Completions with non-zero (error) status and a wr_id / opcode
> combination were received that were never queued by the application.
> In case of error the opcode of the completed operation is not provided. I am not sure why.
Eli, nothing in the IB spec mandates that the WC.opcode of an
unsuccessful work request be valid. The only WC fields that must be
valid are the work-request ID (cookie) and the status code. I believe
that hardware vendors would also make sure to have the vendor id valid...
Bart, reading your initial posting, I was under the impression that the
wr_id was something your app didn't post, so in that respect I take back
my response. And of course, when you program to IB you can't assume
anything about the WC.opcode of an errored WR.
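Since only the wr_id and the status can be relied on for a failed work request, one way for an application to cope is to record the opcode itself when it posts and look it up by wr_id on completion. The following is a minimal sketch under stated assumptions, not the verbs API: `struct wc`, the enums, `track_post()` and `complete_wr()` are hypothetical stand-ins that model only the fields discussed above, and wr_id is assumed to be a small index into a fixed table.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the real verbs types; only the fields
 * discussed in this thread are modelled. */
enum wc_status { WC_SUCCESS, WC_ERROR };
enum wr_opcode { OP_SEND, OP_RDMA_READ, OP_RDMA_WRITE, OP_UNKNOWN };

struct wc {
    uint64_t wr_id;          /* valid even on error */
    enum wc_status status;   /* valid even on error */
    enum wr_opcode opcode;   /* NOT trusted here unless status is success */
};

#define MAX_WR 64            /* assumption: wr_id is an index below MAX_WR */

struct wr_record {
    int in_use;
    enum wr_opcode opcode;
};

static struct wr_record posted[MAX_WR];

/* Remember the opcode at post time. Returns 0, or -1 if the wr_id is out
 * of range or already outstanding. */
static int track_post(uint64_t wr_id, enum wr_opcode op)
{
    if (wr_id >= MAX_WR || posted[wr_id].in_use)
        return -1;
    posted[wr_id].in_use = 1;
    posted[wr_id].opcode = op;
    return 0;
}

/* Look up the opcode recorded at post time, ignoring wc->opcode entirely.
 * Returns OP_UNKNOWN when the wr_id was never posted (or was already
 * completed), which is the situation reported at the start of the thread. */
static enum wr_opcode complete_wr(const struct wc *wc)
{
    if (wc->wr_id >= MAX_WR || !posted[wc->wr_id].in_use)
        return OP_UNKNOWN;
    posted[wc->wr_id].in_use = 0;
    return posted[wc->wr_id].opcode;
}
```

With this bookkeeping, a completion that comes back as OP_UNKNOWN is one the application can flag as never queued, regardless of what the hardware placed in the opcode field.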
Or.
>
>>>> Note: some work requests were queued with and some without the flag
>>>> IB_SEND_SIGNALED. I'm not sure however whether that has anything to do
>>>> with the observed behavior.
> If you have WRs for which you did not set IB_SEND_SIGNALED, they are
> not considered completed before a completion entry is pushed to the CQ
> that corresponds to that send queue. I am not sure if it means that all
> the WRs in the send queue should be completed with error.
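The bookkeeping implied by unsignaled work requests can be sketched as follows: an unsignaled WR is only retired once a completion for a later WR on the same send queue shows up. This is a rough illustration, assuming the send queue completes in posting order; `struct sq_tracker`, `sq_post()` and `sq_complete()` are hypothetical names, not part of any driver or of the verbs API, and the sketch does not settle the open question about which WRs get error completions.

```c
#include <stdint.h>

#define SQ_DEPTH 16  /* hypothetical send queue depth (power of two) */

/* Application-side shadow of the send queue: every posted WR, signaled
 * or not, is recorded in posting order. Zero-initialise before use. */
struct sq_tracker {
    uint64_t wr_id[SQ_DEPTH];
    unsigned head;   /* count of WRs posted so far */
    unsigned tail;   /* count of WRs retired so far */
};

/* Record a posted WR. Returns 0, or -1 if the shadow queue is full. */
static int sq_post(struct sq_tracker *sq, uint64_t wr_id)
{
    if (sq->head - sq->tail == SQ_DEPTH)
        return -1;
    sq->wr_id[sq->head % SQ_DEPTH] = wr_id;
    sq->head++;
    return 0;
}

/* Handle a completion entry for wr_id: retire that WR together with all
 * older WRs that never produced a completion of their own. Returns the
 * number of WRs retired, or -1 if wr_id is not outstanding. */
static int sq_complete(struct sq_tracker *sq, uint64_t wr_id)
{
    unsigned i;

    for (i = sq->tail; i != sq->head; i++) {
        if (sq->wr_id[i % SQ_DEPTH] == wr_id) {
            int retired = (int)(i - sq->tail + 1);
            sq->tail = i + 1;
            return retired;
        }
    }
    return -1;
}
```

For example, after posting wr_ids 1, 2 and 3 with only 3 signaled, a single completion for 3 retires all three entries; a later completion carrying wr_id 2 would return -1, which the application can treat as unexpected.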
>>>> This behavior is easy to reproduce. If I interpret the InfiniBand
>>>> Architecture Specification correctly, this behavior is non-compliant.
>>>>
>>>> Has anyone been looking into this before ?
>>> I haven't seen it. It isn't supposed to happen.
>>>
>>> What hardware and software are you using and how do you
>>> reproduce it?
>> Hello Ralph and Or,
>>
>> The way I reproduce that behavior is by modifying the state of a queue
>> pair into IB_QPS_ERR while RDMA is ongoing. The application, which is
>> multithreaded, performs RDMA by calling ib_post_recv() and
>> ib_post_send() (opcodes IB_WR_SEND, IB_WR_RDMA_READ and
>> IB_WR_RDMA_WRITE). This has been observed with the mlx4 driver, a
>> ConnectX HCA and firmware version 2.7.0.
>>
>> Bart.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
Thread overview: 12+ messages
2010-10-12 18:38 Work completions generated after a queue pair has made the transition to an error state Bart Van Assche
[not found] ` <AANLkTimcxsymqmzoki=quCH+a2sq_fPb4YOmf3gqrzqh-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-10-12 18:50 ` Ralph Campbell
[not found] ` <1286909435.27343.93.camel-/vjeY7uYZjrPXfVEPVhPGq6RkeBMCJyt@public.gmane.org>
2010-10-12 18:58 ` Bart Van Assche
[not found] ` <AANLkTi=72Y+coH1Ke4U-Xk7Eaqpw5pipWRqXEQr7dOau-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-10-12 20:22 ` Eli Cohen
2010-10-13 12:37 ` Eli Cohen
2010-10-13 13:51 ` Or Gerlitz [this message]
[not found] ` <4CB5B94E.4080802-smomgflXvOZWk0Htik3J/w@public.gmane.org>
2010-10-13 14:23 ` Eli Cohen
2010-10-13 16:05 ` Roland Dreier
[not found] ` <adawrpmyt5w.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-10-13 16:32 ` Eli Cohen
2010-10-13 16:18 ` Roland Dreier
[not found] ` <adapqveyskb.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-10-13 16:55 ` Bart Van Assche
2010-10-12 18:52 ` Or Gerlitz