linux-rdma.vger.kernel.org archive mirror
From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: Santosh Shilimkar
	<santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
	hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	Haakon Bugge
	<haakon.bugge-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH] IB/ipoib: Skip napi_schedule if ib_poll_cq fails
Date: Wed, 13 Jul 2016 14:53:58 -0600
Message-ID: <20160713205358.GA27704@obsidianresearch.com>
In-Reply-To: <02892134-15c7-963a-d13b-95d6e35ceaca-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

On Wed, Jul 13, 2016 at 01:46:07PM -0700, Santosh Shilimkar wrote:
> >The patch does not offer any recovery mechanism; it simply prints a fatal
> >error to the console and exits NAPI. This fatal error will prompt the admin
> >to reload the driver or something like that.
> >This takes us to "recovery-mechanism" :)
> >I'm not sure that restarting the QP will help, as the error occurs while
> >reading the CQ, and restarting the CQ is more or less like restarting the
> >driver.
> >
> Jason probably means destroying the problematic CQ and creating a new one.
> This is what Haakon suggested as well, but it will lead to a leak and
> possibly to outstanding WCs being lost without being flushed on that CQ.

Again, this seems crazy.

What failure mode does a CQ have that does not require a full driver
restart after?

Drivers that hard fail a CQ poll should declare themselves dead and
require a full restart to recover. This is the same infrastructure
that was added to the mlx drivers to handle other forms of hard
errors.
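
A rough userspace sketch of that idea, for illustration only: a hard
fault during a poll marks the whole device dead, and from then on the
poll reports zero completions instead of an error code. Every name here
(mock_dev, mock_wc, mock_poll_cq, mock_dev_declare_dead) is hypothetical
and is not the kernel or mlx driver API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's struct ib_device / struct ib_wc. */
struct mock_wc {
	unsigned long wr_id;
};

struct mock_dev {
	bool dead;	/* set once after an unrecoverable error */
	bool hw_fault;	/* simulates the hardware reporting a hard CQ failure */
	int pending;	/* completions waiting in the simulated CQ */
};

/* Mark the whole device dead; only a full driver restart recovers it. */
static void mock_dev_declare_dead(struct mock_dev *dev)
{
	dev->dead = true;
	dev->pending = 0;
}

/*
 * Poll up to num_entries completions into wc[].
 * Returns the number of entries written (possibly 0); a hard failure is
 * never reported to the caller, it kills the device instead.
 */
static int mock_poll_cq(struct mock_dev *dev, int num_entries,
			struct mock_wc *wc)
{
	int n = 0;

	assert(dev != NULL && wc != NULL && num_entries >= 0);

	if (dev->dead)
		return 0;

	if (dev->hw_fault) {
		mock_dev_declare_dead(dev);
		return 0;
	}

	while (n < num_entries && dev->pending > 0) {
		wc[n].wr_id = (unsigned long)dev->pending;
		dev->pending--;
		n++;
	}
	return n;
}
```

With this shape the consumer never needs an error path of its own for
the poll; a separate device-level mechanism (not shown) would be what
tells the admin that a restart is needed.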

This was the same direction we went in for the _destroy functions.

poll_cq should only return -EAGAIN or success (and -EAGAIN seems
fairly strange; how is that different from returning 0 WCs?)
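
To make the question concrete, here is a minimal sketch of a consumer
loop under that contract. It is an assumption-laden mock, not ipoib
code: poll_fn, drain_cq and the two mock pollers are hypothetical names.
Note that the caller ends up doing exactly the same thing for -EAGAIN
and for zero completions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type standing in for a driver's poll_cq hook. */
typedef int (*poll_fn)(void *cq, int num_entries);

/*
 * Drain completions in batches until the budget is used up or the CQ
 * has nothing more to give. Returns the number of completions handled.
 */
static int drain_cq(poll_fn poll, void *cq, int budget, int batch)
{
	int done = 0;

	assert(poll != NULL && budget >= 0 && batch > 0);

	while (done < budget) {
		int want = budget - done < batch ? budget - done : batch;
		int n = poll(cq, want);

		/* 0 and -EAGAIN both mean "nothing to process right now". */
		if (n <= 0)
			break;

		done += n;
		if (n < want)
			break;	/* CQ drained before the batch filled up */
	}
	return done;
}

/* Mock poller: cq points to an int counting the pending completions. */
static int mock_poll_counter(void *cq, int num_entries)
{
	int *pending = cq;
	int n = *pending < num_entries ? *pending : num_entries;

	*pending -= n;
	return n;
}

/* Mock poller that always reports -EAGAIN. */
static int mock_poll_eagain(void *cq, int num_entries)
{
	(void)cq;
	(void)num_entries;
	return -EAGAIN;
}
```

An empty CQ polled through mock_poll_counter and any CQ polled through
mock_poll_eagain both make drain_cq return 0, so the two results are
indistinguishable to this caller.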

If there is some actual reason preventing a driver from implementing
that kind of API, let's hear it. Otherwise I recommend you focus patches
on fixing any broken drivers and documenting this requirement.

Jason

