From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Paul E. McKenney"
Subject: Re: [PATCH] kvm: use the correct RCU API
Date: Tue, 20 Apr 2010 11:42:38 -0700
Message-ID: <20100420184238.GI2628@linux.vnet.ibm.com>
References: <4BCC2543.7050104@cn.fujitsu.com> <4BCC2710.8090809@redhat.com> <20100419233522.GO2564@linux.vnet.ibm.com> <4BCD0CF5.3060600@cn.fujitsu.com>
Reply-To: paulmck@linux.vnet.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Avi Kivity, Marcelo Tosatti, Ingo Molnar, LKML, kvm@vger.kernel.org
To: Lai Jiangshan
Content-Disposition: inline
In-Reply-To: <4BCD0CF5.3060600@cn.fujitsu.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On Tue, Apr 20, 2010 at 10:09:57AM +0800, Lai Jiangshan wrote:
> Paul E. McKenney wrote:
> > On Mon, Apr 19, 2010 at 12:49:04PM +0300, Avi Kivity wrote:
> >> On 04/19/2010 12:41 PM, Lai Jiangshan wrote:
> >>> The RCU/SRCU API have already changed for proving RCU usage.
> >>>
> >>> I got the following dmesg when PROVE_RCU=y because we used incorrect API.
> >>> This patch converts rcu_dereference() to srcu_dereference() or family API.
> >>>
> >>> ===================================================
> >>> [ INFO: suspicious rcu_dereference_check() usage. ]
> >>> ---------------------------------------------------
> >>> arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection!
> >>>
> >>> other info that might help us debug this:
> >>>
> >>>
> >>> rcu_scheduler_active = 1, debug_locks = 0
> >>> 2 locks held by qemu-system-x86/8550:
> >>> #0: (&kvm->slots_lock){+.+.+.}, at: [] kvm_set_memory_region+0x29/0x50 [kvm]
> >>> #1: (&(&kvm->mmu_lock)->rlock){+.+...}, at: [] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm]
> >>>
> >>> stack backtrace:
> >>> Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27
> >>> Call Trace:
> >>> [] lockdep_rcu_dereference+0xaa/0xb3
> >>> [] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm]
> >>> [] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm]
> >>> [] __kvm_set_memory_region+0x636/0x6e2 [kvm]
> >>> [] kvm_set_memory_region+0x37/0x50 [kvm]
> >>> [] vmx_set_tss_addr+0x46/0x5a [kvm_intel]
> >>> [] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm]
> >>> [] ? unlock_page+0x27/0x2c
> >>> [] ? __do_fault+0x3a9/0x3e1
> >>> [] kvm_vm_ioctl+0x364/0x38d [kvm]
> >>> [] ? up_read+0x23/0x3d
> >>> [] vfs_ioctl+0x32/0xa6
> >>> [] do_vfs_ioctl+0x495/0x4db
> >>> [] ? fget_light+0xc2/0x241
> >>> [] ? do_sys_open+0x104/0x116
> >>> [] ? retint_swapgs+0xe/0x13
> >>> [] sys_ioctl+0x47/0x6a
> >>> [] system_call_fastpath+0x16/0x1b
> >>>
> >>>
> >>>
> >>> +static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm)
> >>> +{
> >>> +	return rcu_dereference_check(kvm->memslots,
> >>> +			srcu_read_lock_held(&kvm->srcu)
> >>> +			|| lockdep_is_held(&kvm->slots_lock));
> >>> +}
> >>> +
> >>
> >> This open-codes srcu_dereference().  I guess we need an
> >> srcu_dereference_check().  Paul?
> >
>
> rcu_dereference_check() is useful when rcu_dereference(),
> rcu_dereference_bh(), rcu_dereference_sched() and srcu_dereference()
> are not appropriate.
>
> I think we don't need srcu_dereference_check() nor rcu_dereference_bh_check()
> nor rcu_dereference_sched_check().
>
> > One is coming in Arnd's sparse-based patchset.  It is probably best
> > to open-code this in the meantime and clean up later, but I will
> > double-check with Arnd.
> >
> >> btw, perhaps it is possible not to call rcu_dereference from the
> >> write paths.
> >
> > There is an rcu_dereference_protected() on its way to mainline to handle
> > the case where the reference is always protected by a lock.  Why not
> > just access it directly?  Because if you do that, the sparse-based checks
> > will yell at you.
> >
> > There is also an rcu_access_pointer() on its way to mainline for cases
> > where you only want to test the pointer itself, not dereference it.
> >
> > 							Thanx, Paul
>
> I reviewed the code, the functions can be called from the srcu-read-site
> or update-site, rcu_dereference_check() can simplify the code.
>
> If we use rcu_dereference_protected(), we may need to duplicate the functions.

We would only use rcu_dereference_protected() in cases where there is no
read-side access, so there would be no need for per-read-side versions
of this function.

> I think there is very small overhead of using rcu_dereference(), so we can
> call it from write paths.

In many cases, this is quite true.

							Thanx, Paul
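
For illustration only, the distinction drawn in this thread can be modeled
in userspace C.  This is a sketch under stated assumptions, not kernel code:
the model_* macros, struct vm, struct memslots, the two flag fields and
demo() are all hypothetical stand-ins, and the macros imitate only the
calling shape of rcu_dereference_check(), rcu_dereference_protected() and
rcu_access_pointer(), not their implementations.  A counter takes the place
of the PROVE_RCU splat.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct kvm_memslots and struct kvm. */
struct memslots { int nmemslots; };

struct vm {
	struct memslots *memslots; /* RCU-protected pointer in the real code */
	int srcu_read_held;        /* models srcu_read_lock_held(&kvm->srcu) */
	int slots_lock_held;       /* models lockdep_is_held(&kvm->slots_lock) */
};

/* Counts the accesses that PROVE_RCU would report as suspicious. */
static int suspicious_uses;

/* Model of rcu_dereference_check(p, c): fine for readers or lock holders. */
#define model_dereference_check(p, c) \
	((void)((c) ? 0 : ++suspicious_uses), (p))

/* Model of rcu_dereference_protected(p, c): update side only. */
#define model_dereference_protected(p, c) \
	((void)((c) ? 0 : ++suspicious_uses), (p))

/* Model of rcu_access_pointer(p): value is only tested, never dereferenced. */
#define model_access_pointer(p) (p)

/* Shared helper, callable from both the read side and the update side. */
static struct memslots *vm_memslots(struct vm *vm)
{
	return model_dereference_check(vm->memslots,
			vm->srcu_read_held || vm->slots_lock_held);
}

/* Update-side-only helper: no reader ever calls this one. */
static struct memslots *vm_memslots_update(struct vm *vm)
{
	return model_dereference_protected(vm->memslots, vm->slots_lock_held);
}

/* Runs the cases below and returns the number of suspicious accesses. */
static int demo(void)
{
	struct memslots slots = { .nmemslots = 4 };
	struct vm vm = { .memslots = &slots };

	suspicious_uses = 0;

	vm.srcu_read_held = 1;               /* read side */
	assert(vm_memslots(&vm)->nmemslots == 4);
	vm.srcu_read_held = 0;

	vm.slots_lock_held = 1;              /* update side */
	assert(vm_memslots(&vm) == vm_memslots_update(&vm));
	vm.slots_lock_held = 0;

	(void)vm_memslots(&vm);              /* unprotected: counted once */

	/* Pointer test only, so no protection is needed. */
	assert(model_access_pointer(vm.memslots) != NULL);

	return suspicious_uses;
}
```

In this model a single helper with a combined condition serves both the
read side and the update side, which is the simplification argued for
above, while the protected variant is reserved for paths that no reader
ever takes.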