From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751399AbXDRI0w (ORCPT ); Wed, 18 Apr 2007 04:26:52 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S933030AbXDRI0w (ORCPT ); Wed, 18 Apr 2007 04:26:52 -0400
Received: from stinky.trash.net ([213.144.137.162]:53621 "EHLO stinky.trash.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750978AbXDRI0v (ORCPT ); Wed, 18 Apr 2007 04:26:51 -0400
Message-ID: <4625D637.2040308@trash.net>
Date: Wed, 18 Apr 2007 10:26:31 +0200
From: Patrick McHardy
User-Agent: Debian Thunderbird 1.0.7 (X11/20051017)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Evgeniy Polyakov
CC: Pavel Emelianov , David Miller , Linux Netdev List , Andrew Morton , Linux Kernel Mailing List , devel@openvz.org, Kirill Korotaev
Subject: Re: [NETLINK] Don't attach callback to a going-away netlink socket
References: <4625D3D2.9030507@sw.ru> <20070418081707.GA29267@2ka.mipt.ru>
In-Reply-To: <20070418081707.GA29267@2ka.mipt.ru>
X-Enigmail-Version: 0.93.0.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Evgeniy Polyakov wrote:
> On Wed, Apr 18, 2007 at 12:16:18PM +0400, Pavel Emelianov (xemul@sw.ru) wrote:
>
>>Sorry, I forgot to put netdev and David in Cc when I first sent it.
>>
>>There is a race between netlink_dump_start() and netlink_release()
>>that can lead to the situation when a netlink socket with non-zero
>>callback is freed.
>
>
> Out of curiosity, why not to fix a netlink_dump_start() to remove
> callback in error path, since in 'no-error' path it removes it in
> netlink_dump().

It already does (netlink_destroy_callback), but that doesn't help
with this race, since without this patch we don't enter the error
path.

> And, btw, can release method be called while socket is being used, I
> thought about proper reference counters should prevent this, but not
> 100% sure with RCU dereferencing of the descriptor.

The problem is asynchronous processing of the dump request in the
context of a different process. A process requests a dump, the message
is queued, and the process returns from sendmsg because some other
process is already processing the queue. The process then closes the
socket, which results in netlink_release being called. When the dump
request is finally processed, the race Pavel described can happen.
This can only happen for netlink families that use mutex_trylock for
queue processing, of course.
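
To make the ordering above concrete, here is a minimal user-space sketch
that replays the interleaving in a single thread. It is not kernel code:
struct model_sock, its fields, and the model_* and replay_race names are
all hypothetical, and the "dead" flag check is only an assumption about
what refusing to attach a callback to a going-away socket could look
like, not a copy of the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a netlink socket; not a kernel structure. */
struct model_sock {
    bool dead;        /* set once the release step has run */
    bool has_cb;      /* a dump callback is currently attached */
    bool queued_dump; /* a dump request is waiting in the queue */
};

/* Models starting a dump: attaches a callback unless the (assumed)
 * dead-socket check is enabled and the socket is already going away. */
static int model_dump_start(struct model_sock *s, bool check_dead)
{
    if (check_dead && s->dead)
        return -1;    /* error path: the callback is never attached */
    s->has_cb = true;
    return 0;
}

/* Models release: drops any callback visible right now, then marks
 * the socket dead. A callback attached later is not seen here. */
static void model_release(struct model_sock *s)
{
    assert(!s->dead); /* release runs once per socket in this model */
    s->has_cb = false;
    s->dead = true;
}

/* Replays the described ordering. Returns true if a callback ends up
 * attached to a socket that has already been released. */
static bool replay_race(bool check_dead)
{
    struct model_sock s = { false, false, false };

    /* 1. Sender queues the dump request; the trylock on the queue
     *    fails, so it returns without processing the request. */
    s.queued_dump = true;

    /* 2. Sender closes the socket. */
    model_release(&s);

    /* 3. The other process finally reaches the queued request. */
    if (s.queued_dump) {
        s.queued_dump = false;
        (void)model_dump_start(&s, check_dead);
    }

    return s.dead && s.has_cb;
}
```

In this model, replay_race(false) leaves a callback on a released
socket, while replay_race(true) takes the error path in step 3 and
attaches nothing.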