From: Kleber Sacilotto de Souza <klebers-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Hal Rosenstock
	<hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Jack Morgenstein
	<jackm-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Subject: Re: [PATCH] IB/mlx4: Fail post send command on error recovery
Date: Mon, 08 Apr 2013 11:07:05 -0300
Message-ID: <5162CF09.1010509@linux.vnet.ibm.com>
In-Reply-To: <CAL1RGDU=hn7hMy0ECQ7AOQqmuB8R6+BT6JUudNS_6rPBKr2UtQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 04/04/2013 06:49 PM, Roland Dreier wrote:
> 
> I don't know so much about this PCI error recovery stuff but it does
> seem sensible to trigger a catastrophic error async event when it
> happens (I'm assuming the recovery mechanism resets the adapter).

PCI error recovery on the powerpc architecture, which is where I'm
focusing, works by identifying a misbehaving adapter and freezing its
slot, so that all MMIO writes to that device are ignored and reads
return all 1s. When that happens, the Linux implementation invokes
callbacks in the driver (in this case mlx4_core) to recover from the
error, and resets the slot. The most common procedure is for the
driver to remove the adapter and add it back, which is what mlx4_ib
is trying to do.
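
To make the freeze behaviour concrete, here is a rough userspace sketch
of the sequence described above. It is only a model, not kernel or mlx4
code: the sim_* names, the single register, and the reset counter are
all made up for illustration, and treating an all-1s read as "frozen"
is a simplifying assumption, since a real register can legitimately
hold that value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one adapter in a PCI slot (illustration only). */
struct sim_device {
    bool     frozen;  /* slot has been frozen after an error was detected */
    uint32_t reg;     /* a single simulated MMIO register */
    int      resets;  /* how many times the slot has been reset */
};

/* While the slot is frozen, MMIO writes are silently dropped. */
static void sim_mmio_write(struct sim_device *d, uint32_t val)
{
    if (!d->frozen)
        d->reg = val;
}

/* While the slot is frozen, MMIO reads return all 1s. */
static uint32_t sim_mmio_read(const struct sim_device *d)
{
    return d->frozen ? UINT32_MAX : d->reg;
}

/* Heuristic check (assumption): an all-1s read suggests a frozen slot. */
static bool sim_looks_frozen(const struct sim_device *d)
{
    return sim_mmio_read(d) == UINT32_MAX;
}

/*
 * Stand-in for the recovery step: the driver removes the adapter, the
 * slot is reset (clearing device state and unfreezing it), and the
 * driver adds the adapter back.
 */
static void sim_recover(struct sim_device *d)
{
    d->reg = 0;
    d->frozen = false;
    d->resets++;
}

/* Walk through the whole sequence; returns the number of slot resets. */
static int sim_demo(void)
{
    struct sim_device d = { .frozen = false, .reg = 0, .resets = 0 };

    sim_mmio_write(&d, 0x1234);
    assert(sim_mmio_read(&d) == 0x1234);  /* normal operation */

    d.frozen = true;                      /* error detected, slot frozen */
    sim_mmio_write(&d, 0x5678);           /* ignored */
    assert(sim_looks_frozen(&d));         /* reads now return all 1s */

    sim_recover(&d);                      /* remove, reset slot, re-add */
    assert(!sim_looks_frozen(&d));
    return d.resets;
}
```

The point of the model is that, between the freeze and the reset, the
driver gets no useful feedback from the hardware, so anything posted to
the device during that window is lost unless the driver fails it
explicitly.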

> 
> Then we should fix at least kernel ULPs behave appropriately when they
> get such an async event.  And similarly if someone wants to harden
> some subset of userspace apps to handle PCI error recovery too, that
> would be another step forward.
> 

I agree; this seems to be what is missing for the error recovery to be
fully functional.


Thanks,

-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center
