From: Pradeep Satyanarayana <pradeeps-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Mike Marciniszyn
	<mike.marciniszyn-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Gary Leshner
	<gary.leshner-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>,
	Tom Elken <tom.elken-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH] IPoIB: fix faulty list maintenance in path and neigh list
Date: Fri, 18 Feb 2011 18:07:07 -0800
Message-ID: <4D5F25CB.5000802@linux.vnet.ibm.com>
In-Reply-To: <35AAF1E4A771E142979F27B51793A4888838446B0D-HolNjIBXvBOXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>

On 02/17/2011 03:34 PM, Mike Marciniszyn wrote:
> We too have had instability, perhaps associated with these lists, but it has been difficult to diagnose.
>
> We reproduce it by forcing dropped packets and seeing the QPs come and go at a rate of thousands per second because of the 0 rnr_retry and retry counts.  This analysis is queued behind other bug investigations.
>
> The list patch was a result of code inspection.
>
> Ralph's patch predates me. It appears to move some list inserts to before a post, I assume because an intervening completion could occur, but I haven't studied it in detail to see whether any locking prevents that.
>
> I would be interested in Pradeep's test (OS, Hardware, scripts...)

As described in one of my previous mails (in the url given below):

The test is basically to run netperf in a loop from several client
machines to a server. The server unloads and reloads the modules
(basically an "openibd restart") at random times. The crashes
reproduce within several hours. I used some of the large IBM servers.
The crashes did not seem to reproduce on, say, smaller blades.
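
A rough sketch of that test setup, in Python, might look like the
following. It is only an illustration under stated assumptions: the
helper names, the 60-second netperf run length, the 1 to 10 minute
restart window, and the /etc/init.d/openibd path are all hypothetical
and are not taken from the original scripts.

```python
import random
import subprocess
import time


def netperf_cmd(server, seconds=60):
    """Build one netperf TCP_STREAM run against `server` (run length is an assumption)."""
    return ["netperf", "-H", server, "-l", str(seconds), "-t", "TCP_STREAM"]


def restart_delays(rng, count, low=60.0, high=600.0):
    """Return `count` random waits in seconds between module restarts (window is an assumption)."""
    return [rng.uniform(low, high) for _ in range(count)]


def client_loop(server, runs, run=subprocess.call):
    """Client side: run netperf back to back and collect the exit codes.

    Non-zero exit codes are expected while the server is reloading its modules.
    """
    return [run(netperf_cmd(server)) for _ in range(runs)]


def server_loop(delays, run=subprocess.call, sleep=time.sleep):
    """Server side: wait a random time, then restart the IB stack, repeatedly.

    The init script path is an assumption; adjust it for the installed OFED.
    """
    for delay in delays:
        sleep(delay)
        run(["/etc/init.d/openibd", "restart"])
```

On each client one would call something like client_loop("server-ib0", 1000),
and on the server server_loop(restart_delays(random.Random(), 100)), where
"server-ib0" stands in for the server's IPoIB hostname. The `run` and `sleep`
parameters exist only so the loops can be exercised without touching real
hardware.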

>
> Mike
>
> -----Original Message-----
> From: roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org [mailto:roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org] On Behalf Of Roland Dreier
> Sent: Thursday, February 17, 2011 6:24 PM
> To: Pradeep Satyanarayana
> Cc: Mike Marciniszyn; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Gary Leshner; Tom Elken
> Subject: Re: [PATCH] IPoIB: fix faulty list maintenance in path and neigh list
>
>> Yes, that is the crux of the issue. I had missed that ipoib_mcast_free() is
>> only called on remove_list.
>
> So do we have any idea of what this patch is fixing?  Any thoughts from
> the qlogic people involved in this patch?
>
>> While we are discussing IPoIB issues, how about the two other issues that
>> I illustrated previously. One was Ralph Campbell's patch for fixes to
>> ipoib_cm_start_rx_drain() and my questions wrt ipoib_neigh_cleanup()?
>
> I do need to take a good look at Ralph's patches to try and understand them
> and I hope apply them.  Not sure I still have any link to your questions though.

Here is the link to the detailed mail I sent:
http://www.spinics.net/lists/linux-rdma/msg07352.html

Thanks
Pradeep
