public inbox for linux-rdma@vger.kernel.org
From: Or Gerlitz <ogerlitz-smomgflXvOZWk0Htik3J/w@public.gmane.org>
To: Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>,
	Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Ralph Campbell
	<ralph.campbell-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>,
	Linux-RDMA <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Work completions generated after a queue pair has made the transition to an error state
Date: Wed, 13 Oct 2010 15:51:10 +0200
Message-ID: <4CB5B94E.4080802@voltaire.com>
In-Reply-To: <20101012202221.GD1617@mtldesk30>

Eli Cohen wrote:
> Completions with non-zero (error) status and a wr_id / opcode 
> combination were received that were never queued by the application.
> In case of error the opcode of the completed operation is not provided. I am not sure why.
Eli, nothing in the IB spec mandates that the WC.opcode of an
unsuccessful work request be valid. The only WC fields that must be
valid are the work-request ID (cookie) and the status code. I believe
that hardware vendors would also make sure the vendor id is valid...

Bart, reading your initial posting, I was under the impression that the
wr_id was something your app didn't post, so in that respect I take back
my response. Of course, when you program to IB you can't assume
anything about the WC.opcode of an errored WR.
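
To illustrate the point, here is a minimal sketch of a completion handler
that trusts only the wr_id and the status. It assumes the application
records the opcode itself when it posts a request and looks it up by
wr_id later. All types and names below (sketch_wc, sketch_req_table,
sketch_track_post, sketch_handle_wc) are hypothetical stand-ins for
illustration, not the kernel's struct ib_wc or any real verbs API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in types for illustration only; not the real verbs structures. */
enum sketch_wc_status { SKETCH_WC_SUCCESS, SKETCH_WC_FLUSH_ERR };
enum sketch_opcode {
    SKETCH_OP_SEND, SKETCH_OP_RDMA_WRITE, SKETCH_OP_RDMA_READ
};

struct sketch_wc {
    uint64_t wr_id;               /* trusted even on error */
    enum sketch_wc_status status; /* trusted even on error */
    enum sketch_opcode opcode;    /* NOT trusted when status is an error */
};

/* What the application remembers about each request it posted. */
struct sketch_req {
    bool in_flight;
    enum sketch_opcode posted_opcode;
};

#define SKETCH_MAX_REQS 64
struct sketch_req_table { struct sketch_req reqs[SKETCH_MAX_REQS]; };

/* Record a request at post time; wr_id doubles as the table index. */
bool sketch_track_post(struct sketch_req_table *t, uint64_t wr_id,
                       enum sketch_opcode op)
{
    assert(t);
    if (wr_id >= SKETCH_MAX_REQS || t->reqs[wr_id].in_flight)
        return false;
    t->reqs[wr_id].in_flight = true;
    t->reqs[wr_id].posted_opcode = op;
    return true;
}

/* Resolve a completion to the opcode that was actually posted.
 * Returns false when wr_id matches no in-flight request, which is the
 * case worth logging and investigating. */
bool sketch_handle_wc(struct sketch_req_table *t, const struct sketch_wc *wc,
                      enum sketch_opcode *op_out)
{
    assert(t && wc && op_out);
    if (wc->wr_id >= SKETCH_MAX_REQS || !t->reqs[wc->wr_id].in_flight)
        return false;
    /* Use the recorded opcode; wc->opcode is deliberately ignored. */
    *op_out = t->reqs[wc->wr_id].posted_opcode;
    t->reqs[wc->wr_id].in_flight = false;
    return true;
}
```

With this kind of bookkeeping, a completion whose wr_id was never posted
shows up as a false return from the lookup rather than as a confusing
wr_id / opcode pair.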

Or.


>
>>>> Note: some work requests were queued with and some without the flag
>>>> IB_SEND_SIGNALED. I'm not sure however whether that has anything to do
>>>> with the observed behavior.
> If you have WRs for which you did not set IB_SEND_SIGNALED, they are
> not considered completed before a completion entry is pushed to the CQ
> that corresponds to that send queue. I am not sure if it means that all
> the WRs in the send queue should be completed with error.
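
The accounting described above can be sketched as follows. This is an
illustration under two assumptions that are not stated in the thread:
that WRs on one send queue complete in posting order, and that the
application uses a monotonically increasing sequence number as the
wr_id. The names (sketch_sq, sketch_sq_post, sketch_sq_complete,
sketch_sq_outstanding) are hypothetical. It does not answer the open
question of whether every WR gets an error completion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical send-queue bookkeeping; illustrative names only. */
struct sketch_sq {
    uint64_t next_seq;    /* sequence number for the next posted WR */
    uint64_t retired_seq; /* every WR with seq < retired_seq is done */
};

/* Called when posting a WR; returns the value to use as its wr_id. */
uint64_t sketch_sq_post(struct sketch_sq *sq)
{
    assert(sq);
    return sq->next_seq++;
}

/* Called when a completion with wr_id == seq is polled from the CQ.
 * Retires that WR and every earlier WR that had no completion of its
 * own, and returns how many WRs were retired. */
uint64_t sketch_sq_complete(struct sketch_sq *sq, uint64_t seq)
{
    assert(sq);
    if (seq < sq->retired_seq || seq >= sq->next_seq)
        return 0; /* stale, or never posted by this application */
    uint64_t retired = seq - sq->retired_seq + 1;
    sq->retired_seq = seq + 1;
    return retired;
}

/* Number of posted WRs not yet covered by any completion. */
uint64_t sketch_sq_outstanding(const struct sketch_sq *sq)
{
    assert(sq);
    return sq->next_seq - sq->retired_seq;
}
```

For example, if four WRs are posted and only the third is signaled, its
completion retires the first three and leaves one outstanding.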
>>>> This behavior is easy to reproduce. If I interpret the InfiniBand
>>>> Architecture Specification correctly, this behavior is non-compliant.
>>>>
>>>> Has anyone been looking into this before ?
>>> I haven't seen it. It isn't supposed to happen.
>>>
>>> What hardware and software are you using and how do you
>>> reproduce it?
>> Hello Ralph and Or,
>>
>> The way I reproduce that behavior is by modifying the state of a queue
>> pair into IB_QPS_ERR while RDMA is ongoing. The application, which is
>> multithreaded, performs RDMA by calling ib_post_recv() and
>> ib_post_send() (opcodes IB_WR_SEND, IB_WR_RDMA_READ and
>> IB_WR_RDMA_WRITE). This has been observed with the mlx4 driver, a
>> ConnectX HCA and firmware version 2.7.0.
>>
>> Bart.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


