From: Arnd Bergmann
To: Thomas Gleixner
Cc: LKML, Rusty Russell, Kay Sievers, Brandon Philips
Subject: Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
Date: Mon, 4 Oct 2010 13:00:14 +0200
Message-Id: <201010041300.15234.arnd@arndb.de>

On Sunday 03 October 2010, Thomas Gleixner wrote:
> Current mainline triggers a list corruption bug in
> module_bug_finalize(). dmesg excerpt below.
>
> The corresponding code says:
>
>         /*
>          * Strictly speaking this should have a spinlock to protect against
>          * traversals, but since we only traverse on BUG()s, a spinlock
>          * could potentially lead to deadlock and thus be counter-productive.
>          */
>         list_add(&mod->bug_list, &module_bug_list);
>
> I can see the traversal problem vs. BUG(), but what's protecting the
> list_add() ? BKL probably did, but is that true anymore ?

BKL hasn't been in this code path since before git.

I think this relatively recent change caused module_finalize to be
called without module_mutex held:

commit 75676500f8298f0ee89db12db97294883c4b768e
Author: Rusty Russell
Date:   Sat Jun 5 11:17:36 2010 -0600

    module: make locking more fine-grained.

    Kay Sievers reports that we still have some contention over module loading
    which is slowing boot.

    Linus also disliked a previous "drop lock and regrab" patch to fix the bne2
    "gave up waiting for init of module libcrc32c" message.

    This is more ambitious: we only grab the lock where we need it.

    Signed-off-by: Rusty Russell
    Cc: Brandon Philips
    Cc: Kay Sievers
    Cc: Linus Torvalds

	Arnd
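
To illustrate why the list_add() needs a lock once module_mutex is no
longer held around it, here is a small user-space sketch. It is not
kernel code and not a proposed fix: struct list_head and list_add() are
re-implemented by hand to mirror the kernel's pointer updates, a pthread
mutex stands in for module_mutex, and the names bug_list_lock, worker
and run_locked_adds are made up for this example. Several threads add
entries to one shared list under the lock, and the list is then walked
to check that every entry is linked consistently.

```c
#include <pthread.h>
#include <stdlib.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Same pointer updates as the kernel's list_add(): insert right after head.
 * This is not atomic. Two unserialized callers can both read the same
 * head->next and overwrite each other's links.
 */
static void list_add(struct list_head *new, struct list_head *head)
{
    struct list_head *next = head->next;

    next->prev = new;
    new->next = next;
    new->prev = head;
    head->next = new;
}

static struct list_head bug_list;
/* Hypothetical lock, playing the role module_mutex used to play. */
static pthread_mutex_t bug_list_lock = PTHREAD_MUTEX_INITIALIZER;

struct worker_arg {
    struct list_head *nodes;
    int count;
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;

    for (int i = 0; i < arg->count; i++) {
        pthread_mutex_lock(&bug_list_lock);
        list_add(&arg->nodes[i], &bug_list);
        pthread_mutex_unlock(&bug_list_lock);
    }
    return NULL;
}

/*
 * Add nthreads * per_thread entries concurrently, then walk the list.
 * Returns the number of entries found, or -1 on allocation failure,
 * thread creation failure, or an inconsistent list.
 */
int run_locked_adds(int nthreads, int per_thread)
{
    int total = nthreads * per_thread;
    int started = 0, count = 0, ok = 1;
    pthread_t *tids = malloc(nthreads * sizeof(*tids));
    struct worker_arg *args = malloc(nthreads * sizeof(*args));
    struct list_head *nodes = malloc(total * sizeof(*nodes));

    if (!tids || !args || !nodes) {
        free(tids);
        free(args);
        free(nodes);
        return -1;
    }

    list_init(&bug_list);
    for (int t = 0; t < nthreads; t++) {
        args[t].nodes = nodes + t * per_thread;
        args[t].count = per_thread;
        if (pthread_create(&tids[t], NULL, worker, &args[t]) != 0) {
            ok = 0;
            break;
        }
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tids[t], NULL);

    /* Every entry must point back at its predecessor; cap the walk at total. */
    for (struct list_head *p = bug_list.next; ok && p != &bug_list; p = p->next) {
        if (p->next->prev != p || ++count > total)
            ok = 0;
    }

    free(tids);
    free(args);
    free(nodes);
    return ok ? count : -1;
}
```

With the lock/unlock pair in worker() the walk finds every entry. If the
pair is removed, the adds race in the same way the unprotected
list_add() in module_bug_finalize() can, and entries may go missing or
end up with mismatched next/prev pointers.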