Date: Fri, 11 Mar 2016 15:50:35 -0800
From: Guenter Roeck
To: Jun Wang
Cc: linux-kernel@vger.kernel.org, Herbert Xu, Cong Wang
Subject: Re: Glibc recvmsg from kernel netlink socket hangs forever
Message-ID: <20160311235035.GA18302@roeck-us.net>

On Fri, Mar 11, 2016 at 11:33:17AM -0800, Jun Wang wrote:
> > On 09/25/2015 08:55 AM, Herbert Xu wrote:
> >> On Thu, Sep 24, 2015 at 10:34:10PM -0700, Guenter Roeck wrote:
> >>>
> >>> Any idea what may be needed for 4.1 ?
> >>> I am currently trying https://patchwork.ozlabs.org/patch/473041/,
> >>
> >> This patch should not make any difference on 4.1 and later because
> >> 4.1 is where I rewrote rhashtable resizing and it should work (or
> >> if it is broken then the latest kernel should be broken too).
> >>
> > Yes, applying (only) the above patch to 4.1 didn't help.
> >
> >>> but I have no idea if that will help with the problem we are seeing there.
> >>
> >> Having looked at your message again I don't think the issue I
> >> alluded to is relevant since the symptom there ought to be a
> >> straight kernel lock-up as opposed to just a user-space one because
> >> you will end up with the kernel sending a message to itself.
> >>
> >> And the fact that 4.2 works is more indicative as the bug is
> >> present in both 4.1 and 4.2.
> >>
> >> I'll try to reproduce this in 4.1 as time permits but no promises.
> >>
> >
> > I applied your patches (and a few additional netlink changes from 4.2)
> > to our 4.1 branch. I'll let you know if it makes a difference for us.
> >
> > Thanks,
> > Guenter
>
> Guenter,
>
> Which additional netlink changes from 4.2 did you patch? We still see
> the problem with your test program with 4.1.12 which has the
> following two patches mentioned by Herbert Xu on this thread.
>

Jun,

Sorry, I don't recall, and I no longer have access to the kernel
since I now work for a different company. I do recall that we had
to apply additional patches later, but I don't remember details.

Guenter