From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754626Ab1HDPOY (ORCPT ); Thu, 4 Aug 2011 11:14:24 -0400
Received: from casper.infradead.org ([85.118.1.10]:43194 "EHLO
	casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753337Ab1HDPOX convert rfc822-to-8bit (ORCPT );
	Thu, 4 Aug 2011 11:14:23 -0400
Subject: Re: select_task_rq_fair: WARNING: at kernel/lockdep.c match_held_lock
From: Peter Zijlstra
To: Sergey Senozhatsky
Cc: Ingo Molnar , Thomas Gleixner , Andrew Morton ,
	linux-kernel@vger.kernel.org, KAMEZAWA Hiroyuki , linux-mm@kvack.org
In-Reply-To: <1312470358.16729.25.camel@twins>
References: <20110804141306.GA3536@swordfish.minsk.epam.com>
	<1312470358.16729.25.camel@twins>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8BIT
Date: Thu, 04 Aug 2011 17:13:40 +0200
Message-ID: <1312470820.16729.31.camel@twins>
Mime-Version: 1.0
X-Mailer: Evolution 2.30.3
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2011-08-04 at 17:05 +0200, Peter Zijlstra wrote:
> On Thu, 2011-08-04 at 17:13 +0300, Sergey Senozhatsky wrote:
> > Hello,
> > Got the following trace on 3.0-git19 (07865-g1280ea8):
> >
> > [ 132.794685] WARNING: at kernel/lockdep.c:3117 match_held_lock+0xf6/0x12e()
> > [ 132.794687] Hardware name: Aspire 5741G
> > [ 132.794689] Modules linked in: kvm_intel kvm tun ipv6 microcode snd_hda_codec_hdmi snd_hda_codec_realtek broadcom snd_hda_intel snd_hda_codec tg3 snd_pcm snd_timer snd soundcore acer_wmi evdev libphy sparse_keymap psmouse snd_page_alloc
> > pcspkr battery ac wmi button ehci_hcd sr_mod cdrom usbcore sd_mod ahci
> > [ 132.794731] Pid: 4029, comm: qemu-system-x86 Not tainted 3.1.0-dbg-07865-g1280ea8-dirty #668
> > [ 132.794733] Call Trace:
> > [ 132.794736] [] warn_slowpath_common+0x7e/0x96
> > [ 132.794744] [] warn_slowpath_null+0x15/0x17
> > [ 132.794748] [] match_held_lock+0xf6/0x12e
> > [ 132.794751] [] lock_is_held+0x62/0xa6
> > [ 132.794757] [] cgroup_lock_is_held+0x10/0x12
> > [ 132.794762] [] set_task_cpu+0x1ac/0x3e3
> > [ 132.794766] [] ? select_task_rq_fair+0x5c0/0x9ca
> > [ 132.794769] [] ? try_to_wake_up+0x29/0x28b
> > [ 132.794773] [] ? try_to_wake_up+0x29/0x28b
> > [ 132.794779] [] ? do_raw_spin_lock+0x6b/0x122
> > [ 132.794783] [] try_to_wake_up+0x19f/0x28b
> > [ 132.794787] [] ? update_rmtp+0x65/0x65
> > [ 132.794790] [] wake_up_process+0x10/0x12
> > [ 132.794794] [] hrtimer_wakeup+0x1d/0x21
> > [ 132.794797] [] __run_hrtimer+0x1b1/0x372
> > [ 132.794800] [] hrtimer_interrupt+0xe6/0x1b0
> > [ 132.794805] [] smp_apic_timer_interrupt+0x80/0x93
> > [ 132.794810] [] apic_timer_interrupt+0x73/0x80
> > [ 132.794812] [] ? do_mmu_notifier_register+0x66/0x125
> > [ 132.794822] [] ? mm_take_all_locks+0x10b/0x165
> > [ 132.794826] [] ? mm_take_all_locks+0x139/0x165
> > [ 132.794829] [] ? mm_take_all_locks+0x10b/0x165
> > [ 132.794832] [] do_mmu_notifier_register+0x6e/0x125
> > [ 132.794836] [] mmu_notifier_register+0xe/0x10
> > [ 132.794852] [] kvm_dev_ioctl+0x297/0x400 [kvm]
> > [ 132.794857] [] do_vfs_ioctl+0x46c/0x4ad
> > [ 132.794862] [] ? fget_light+0xed/0x2a7
> > [ 132.794867] [] ? sysret_check+0x2e/0x69
> > [ 132.794871] [] sys_ioctl+0x51/0x75
> > [ 132.794875] [] system_call_fastpath+0x16/0x1b
> > [ 132.794877] ---[ end trace 298584c4014cd2b8 ]---
>
> Curious, how easy is that to reproduce? That really shouldn't happen and
> its not immediately obvious how it could happen.

In particular, mm_take_all_locks() which is called from
do_mmu_notifier_register() uses mutex_lock_nest_lock() in both
vm_lock_anon_vma() and vm_lock_mapping(), both times using mm->mmap_sem
as the nest lock.
As per __lock_acquire(), any lock that passes in a nest_lock will set
hlock->references and also assign this nest_lock to hlock->nest_lock,
and as per lock_acquire() all of that is done with IRQs disabled. So the
interrupt in question should not be able to observe a state where
->references is set but ->nest_lock is not.

So I'm at a loss to explain how match_held_lock() observes exactly that:
a lock for which ->references is set but ->nest_lock is not. That should
be impossible.
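
To make the expected invariant concrete, here is a toy user-space model of
it. This is only a sketch and not the lockdep code: every toy_* name, the
struct layout, and the folding rule are simplifications I made up for
illustration. It only shows why an entry with a reference count should
always carry a nest lock if both fields are written on the same path.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 8

/* Hypothetical, simplified stand-in for a per-task held-lock entry. */
struct toy_held_lock {
	const void *instance;    /* the lock this entry tracks */
	const void *nest_lock;   /* outer lock serializing nested acquisitions */
	unsigned int class_id;   /* made-up class identifier */
	unsigned int references; /* nonzero once same-class locks are folded in */
};

struct toy_task {
	struct toy_held_lock held[TOY_MAX_DEPTH];
	int depth;
};

/*
 * Record an acquisition. A same-class lock taken under the same nest lock
 * is folded into the top entry by bumping ->references (assumed rule).
 * Returns 0 on success, -1 if the toy stack is full.
 */
int toy_acquire(struct toy_task *t, const void *lock,
		unsigned int class_id, const void *nest_lock)
{
	struct toy_held_lock *hlock;

	if (t->depth > 0 && nest_lock) {
		hlock = &t->held[t->depth - 1];
		if (hlock->class_id == class_id && hlock->nest_lock == nest_lock) {
			hlock->references = hlock->references ? hlock->references + 1 : 2;
			return 0;
		}
	}

	if (t->depth >= TOY_MAX_DEPTH)
		return -1;

	hlock = &t->held[t->depth++];
	hlock->instance = lock;
	hlock->nest_lock = nest_lock;
	hlock->class_id = class_id;
	hlock->references = 0;
	return 0;
}

/* The invariant discussed above: ->references set implies ->nest_lock set. */
int toy_invariant_ok(const struct toy_task *t)
{
	for (int i = 0; i < t->depth; i++)
		if (t->held[i].references && !t->held[i].nest_lock)
			return 0;
	return 1;
}

/* Mimics the mm_take_all_locks() pattern: one outer lock, two nested ones. */
unsigned int toy_demo(void)
{
	struct toy_task t = { .depth = 0 };
	int mmap_sem, anon_a, anon_b; /* only their addresses are used */

	toy_acquire(&t, &mmap_sem, 1, NULL);
	toy_acquire(&t, &anon_a, 2, &mmap_sem);
	toy_acquire(&t, &anon_b, 2, &mmap_sem);

	assert(toy_invariant_ok(&t));
	return t.held[t.depth - 1].references;
}
```

Calling toy_demo() from a small main() exercises the pattern. In this
model the invariant can only break if something observes an entry between
the two field writes, which is what running with IRQs disabled is
supposed to rule out.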