On Tue, Jun 07, 2016 at 10:50:54AM -0400, Doug Ledford wrote: > On 6/7/2016 10:43 AM, Doug Ledford wrote: > > On 6/7/2016 10:32 AM, Erez Shitrit wrote: > >> On Tue, Jun 7, 2016 at 4:48 PM, Doug Ledford wrote: > >>> On 6/7/2016 2:01 AM, Erez Shitrit wrote: > >>>> On Tue, Jun 7, 2016 at 2:59 AM, Doug Ledford wrote: > >>>>> On 6/5/2016 5:10 PM, Or Gerlitz wrote: > >>>>> > >>>>>>> - neigh->alive = jiffies; > >>>>>>> + > >>>>>>> + if (likely(skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE)) > >>>>>>> + neigh->alive = jiffies; > >>>>>> > >>>>>> why testing the queue len makes things right? > >>>>> > >>>>> I'm with Or on this one. This test doesn't make sense unless you assume > >>>>> that the neighbor is being hit with a steady stream of packets. If > >>>>> someone just runs ping -c 1 to a dead neighbor, this test will fail. > >>>>> The idea of the patch is OK, but the test needs to be reworked for > >>>>> situations other than a blasting stream of data. > >>>>> > >>>> > >>>> I will try to explain the idea behind that test, and why it works (I > >>>> hope I'll make it clear this time :)) > >>>> > >>>> there are 2 objects that are taking place here, the garbage-collector > >>>> that removes neigh if was not used for defined time, and the > >>>> path-query for each neigh. > >>>> if the path-query failed the packets for specific neigh are kept in > >>>> the neigh queue, but only if there are no more than > >>>> IPOIB_MAX_PATH_REC_QUEUE, and the neigh is still kept for ever (let's > >>>> assume CM connected that was stopped without any notification with > >>>> DREQ etc.) > >>>> The only way to know that there is an unresolved neigh is by checking > >>>> its queue, if it is full we can assume that this is an unresolved > >>>> neigh otherwise it is resolved. > >>> > >>> That's not true. Reread what I wrote above (I was specific when I > >>> picked that command). Ping -c 1 will only send one ping. It will not > >>> trip this test. As I said, your test relies on a stream of packets, a > >>> single packet to a gone neighbor will never cause it to get deleted. > >> > >> If you ping one time (ping -c 1) there is no problem at all, the neigh > >> will be deleted by the GC anyway, because no (unresolved) packet > >> updates the neigh validity and the GC will check the last time it was > >> refreshed and since only one ping refreshed it long ago, it will be > >> deleted. (the GC always running) > >> > >> the problem is when there are non stop traffic to that neigh. IMHO > >> that test works for that. > > > > Ok, that makes sense. > > Sorry, in case it wasn't clear, I grabbed this patch now (although I > reworded the commit message significantly). Thank you, I truly appreciate your help. > >