* [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
From: Tetsuo Handa @ 2011-03-26 4:12 UTC (permalink / raw)
To: peterz, mingo; +Cc: linux-kernel
I got a freeze between "seqlock_t rename_lock" and "DEFINE_BRLOCK(vfsmount_lock)"
in 2.6.38. The bug has already been fixed in the linux-2.6 git tree:

http://www.spinics.net/lists/linux-fsdevel/msg43078.html

This bug is timing-dependent, and lockdep reports nothing before the freeze.
I think lockdep cannot detect this bug because no lock primitives are
called within read_seqbegin():
/*
 * Lock out other writers and update the count.
 * Acts like a normal spin_lock/unlock.
 * Don't need preempt_disable() because that is in the spin_lock already.
 */
static inline void write_seqlock(seqlock_t *sl)
{
        spin_lock(&sl->lock);
        ++sl->sequence;
        smp_wmb();
}

static inline void write_sequnlock(seqlock_t *sl)
{
        smp_wmb();
        sl->sequence++;
        spin_unlock(&sl->lock);
}

static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
{
        unsigned ret;

repeat:
        ret = sl->sequence;
        smp_rmb();
        if (unlikely(ret & 1)) {
                cpu_relax();
                goto repeat;
        }

        return ret;
}
read_seqbegin() waits until the writer (if any) calls write_sequnlock().
In effect, read_seqbegin() behaves like a spin_lock() immediately followed
by a spin_unlock(), yet lockdep never sees either.
--- linux-2.6.38.1.orig/include/linux/seqlock.h
+++ linux-2.6.38.1/include/linux/seqlock.h
@@ -86,6 +86,11 @@ static inline int write_tryseqlock(seqlo
 static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
 {
 	unsigned ret;
+#ifdef CONFIG_PROVE_LOCKING
+	unsigned long flags;
+	spin_lock_irqsave(&((seqlock_t *) sl)->lock, flags);
+	spin_unlock_irqrestore(&((seqlock_t *) sl)->lock, flags);
+#endif
 
 repeat:
 	ret = sl->sequence;
With the patch above applied, lockdep can detect this bug:
[ 179.457160] =======================================================
[ 179.458004] [ INFO: possible circular locking dependency detected ]
[ 179.458004] 2.6.38.1 #2
[ 179.458004] -------------------------------------------------------
[ 179.458004] ccs_execute_han/3717 is trying to acquire lock:
[ 179.458004] (vfsmount_lock){++++..}, at: [<c04d9910>] vfsmount_lock_local_lock+0x0/0x60
[ 179.458004]
[ 179.458004] but task is already holding lock:
[ 179.458004] (rename_lock){+.+...}, at: [<c04d37dc>] __d_path+0x3c/0x80
[ 179.458004]
[ 179.458004] which lock already depends on the new lock.
[ 179.458004]
[ 179.458004]
[ 179.458004] the existing dependency chain (in reverse order) is:
[ 179.458004]
[ 179.458004] -> #1 (rename_lock){+.+...}:
[ 179.458004] [<c046d5f4>] __lock_acquire+0x244/0x6d0
[ 179.458004] [<c046dafb>] lock_acquire+0x7b/0xa0
[ 179.458004] [<c06c50a5>] _raw_spin_lock_irqsave+0x45/0x80
[ 179.458004] [<c04d308b>] is_subdir+0x2b/0xd0
[ 179.458004] [<c04dac29>] sys_pivot_root+0x219/0x290
[ 179.458004] [<c0402c50>] sysenter_do_call+0x12/0x36
[ 179.458004]
[ 179.458004] -> #0 (vfsmount_lock){++++..}:
[ 179.458004] [<c046d39e>] validate_chain+0x10ee/0x1100
[ 179.458004] [<c046d5f4>] __lock_acquire+0x244/0x6d0
[ 179.458004] [<c046dafb>] lock_acquire+0x7b/0xa0
[ 179.458004] [<c04d9943>] vfsmount_lock_local_lock+0x33/0x60
[ 179.458004] [<c04d284c>] prepend_path+0x1c/0x150
[ 179.458004] [<c04d37f4>] __d_path+0x54/0x80
[ 179.458004] [<e090e1f2>] ccs_realpath_from_path+0x122/0x3e0 [ccsecurity]
[ 179.458004] [<e090eca4>] ccs_get_exe+0x54/0x60 [ccsecurity]
[ 179.458004] [<e090ccdb>] ccs_manager+0x6b/0x1e0 [ccsecurity]
[ 179.458004] [<e090d547>] ccs_write_control+0xc7/0x250 [ccsecurity]
[ 179.458004] [<e090dec8>] ccs_write+0x8/0x10 [ccsecurity]
[ 179.458004] [<c050145d>] proc_reg_write+0x5d/0x90
[ 179.458004] [<c04c0256>] vfs_write+0x96/0x140
[ 179.458004] [<c04c080d>] sys_write+0x3d/0x70
[ 179.458004] [<c0402c50>] sysenter_do_call+0x12/0x36
[ 179.458004]
[ 179.458004] other info that might help us debug this:
[ 179.458004]
[ 179.458004] 4 locks held by ccs_execute_han/3717:
[ 179.458004] #0: (&head->io_sem){+.+.+.}, at: [<e090d4ea>] ccs_write_control+0x6a/0x250 [ccsecurity]
[ 179.458004] #1: (&ccs_ss){.+.+.+}, at: [<e090d4f7>] ccs_write_control+0x77/0x250 [ccsecurity]
[ 179.458004] #2: (&mm->mmap_sem){++++++}, at: [<e090ec72>] ccs_get_exe+0x22/0x60 [ccsecurity]
[ 179.458004] #3: (rename_lock){+.+...}, at: [<c04d37dc>] __d_path+0x3c/0x80
[ 179.458004]
[ 179.458004] stack backtrace:
[ 179.458004] Pid: 3717, comm: ccs_execute_han Not tainted 2.6.38.1 #2
[ 179.458004] Call Trace:
[ 179.458004] [<c046bceb>] ? print_circular_bug+0xbb/0xc0
[ 179.458004] [<c046d39e>] ? validate_chain+0x10ee/0x1100
[ 179.458004] [<c0431216>] ? scheduler_tick+0x146/0x1f0
[ 179.458004] [<c046d5f4>] ? __lock_acquire+0x244/0x6d0
[ 179.458004] [<c046dafb>] ? lock_acquire+0x7b/0xa0
[ 179.458004] [<c04d9910>] ? vfsmount_lock_local_lock+0x0/0x60
[ 179.458004] [<c04d9943>] ? vfsmount_lock_local_lock+0x33/0x60
[ 179.458004] [<c04d9910>] ? vfsmount_lock_local_lock+0x0/0x60
[ 179.458004] [<c04d284c>] ? prepend_path+0x1c/0x150
[ 179.458004] [<c04d37f4>] ? __d_path+0x54/0x80
[ 179.458004] [<e090e1f2>] ? ccs_realpath_from_path+0x122/0x3e0 [ccsecurity]
[ 179.458004] [<e090ec72>] ? ccs_get_exe+0x22/0x60 [ccsecurity]
[ 179.458004] [<c06c42af>] ? down_read+0x7f/0x90
[ 179.458004] [<e090eca4>] ? ccs_get_exe+0x54/0x60 [ccsecurity]
[ 179.458004] [<e090ccdb>] ? ccs_manager+0x6b/0x1e0 [ccsecurity]
[ 179.458004] [<e090d4f7>] ? ccs_write_control+0x77/0x250 [ccsecurity]
[ 179.458004] [<e090d547>] ? ccs_write_control+0xc7/0x250 [ccsecurity]
[ 179.458004] [<e090d4f7>] ? ccs_write_control+0x77/0x250 [ccsecurity]
[ 179.458004] [<e090dec0>] ? ccs_write+0x0/0x10 [ccsecurity]
[ 179.458004] [<e090dec8>] ? ccs_write+0x8/0x10 [ccsecurity]
[ 179.458004] [<c050145d>] ? proc_reg_write+0x5d/0x90
[ 179.458004] [<c04c0256>] ? vfs_write+0x96/0x140
[ 179.458004] [<c04af2e7>] ? sys_mmap_pgoff+0x77/0x100
[ 179.458004] [<c0501400>] ? proc_reg_write+0x0/0x90
[ 179.458004] [<c04c080d>] ? sys_write+0x3d/0x70
[ 179.458004] [<c0402c50>] ? sysenter_do_call+0x12/0x36
But there is still a problem. It seems that lockdep checks for this bug
only when a new locking pattern (one that was not already recorded in
lockdep's dependency database) is added. This freeze can be triggered by
running

	while :; do newns /sbin/pivot_root /proc/ /proc/sys/; done

on one terminal and running

	while :; do /bin/ls -l /proc/*/exe; done

on another terminal. ("newns" is a program that unshares the mount namespace
before execve() using CLONE_NEWNS.) But even after applying the patch above,
lockdep does not show the trace above.
lockdep shows the trace above only after I run the test programs for TOMOYO
(which add a locking pattern to lockdep's database that was generated by
neither "/sbin/pivot_root /proc/ /proc/sys/" nor "/bin/ls -l /proc/*/exe").
I think we want some method for rechecking locking patterns that have already
been added. Perhaps it could run periodically (e.g. every 60 seconds), or be
run when a stall-checking mechanism reports a possible stall. (The sysrq key
did not work after the freeze occurred.)
Also, what should be done about the __read_seqcount_begin() case? Since
seqcount_t does not have a spinlock embedded in the struct, we can't use lock
primitives...?
Regards.
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
From: Steven Rostedt @ 2011-03-28 17:12 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: peterz, mingo, linux-kernel

On Sat, Mar 26, 2011 at 01:12:17PM +0900, Tetsuo Handa wrote:
> I got a freeze between "seqlock_t rename_lock" and "DEFINE_BRLOCK(vfsmount_lock)"

Egad, that DEFINE_BRLOCK() is a scary macro. Turns into DEFINE_LGLOCK()
and creates a local and global lock.

> But there is still a problem. It seems to me that lockdep checks for this bug
> only when a new locking pattern (a locking pattern which was not already added
> to lockdep database) is added. This freeze can be triggered by running
>
> 	while :; do newns /sbin/pivot_root /proc/ /proc/sys/; done
>
> on one terminal and running
>
> 	while :; do /bin/ls -l /proc/*/exe; done
>
> on another terminal. (The "newns" is a program that unshares the mnt namespace
> before execve() using CLONE_NEWNS.) But even after applying the patch above,
> lockdep does not show the trace above.
>
> lockdep shows the trace above only after I run test programs for TOMOYO (which
> causes a locking pattern that was generated by neither
> "/sbin/pivot_root /proc/ /proc/sys/" nor "/bin/ls -l /proc/*/exe" to be added
> to lockdep database).
>
> I think that we want some method for rechecking already added locking pattern.
> Maybe it is run by (e.g.) every 60 seconds. Maybe it is run when stall checking
> mechanisms report the possibility of stall. (The sysrq key didn't work after
> the freeze occurred.)

This is where you confused me. Why would a pattern that was previously
checked suddenly break? If we found A->B and then come across B->A,
lockdep should surely catch it. Maybe I don't understand exactly what
you are trying to say.

> Also, what to do with __read_seqcount_begin() case? Since seqcount_t does not
> have a spinlock embedded into the struct, we can't use lock primitives...?

We could probably just add a lockdep key to the seqcount, and "acquire"
and "release" it when necessary. I can look into that as I'm playing
around with lockdep now anyway ;)

-- Steve
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
From: Tetsuo Handa @ 2011-03-28 21:57 UTC (permalink / raw)
To: rostedt; +Cc: peterz, mingo, linux-kernel

Steven Rostedt wrote:
> This is where you confused me. Why would a pattern that was previously
> checked, suddenly break? If we found A->B and then come across B->A,
> lockdep should surely catch it.

I don't know. This bug can be triggered by simply running

	while :; do newns /sbin/pivot_root /proc/ /proc/sys/; done

and

	while :; do /bin/ls -l /proc/*/exe; done

in parallel. However, lockdep can't catch it even after applying

--- linux-2.6.38.1.orig/include/linux/seqlock.h
+++ linux-2.6.38.1/include/linux/seqlock.h
@@ -86,6 +86,11 @@ static inline int write_tryseqlock(seqlo
 static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
 {
 	unsigned ret;
+#ifdef CONFIG_PROVE_LOCKING
+	unsigned long flags;
+	spin_lock_irqsave(&((seqlock_t *) sl)->lock, flags);
+	spin_unlock_irqrestore(&((seqlock_t *) sl)->lock, flags);
+#endif
 
 repeat:
 	ret = sl->sequence;

. Maybe my usage of spin_lock_irqsave()/spin_unlock_irqrestore() is bad.

> Maybe I don't understand exactly what you are trying to say.

Just applying the patch and running the 2 loops above in parallel on 2.6.38.1
will show you. Source code for newns is shown below.

---------- newns.c start ----------
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sched.h>
#include <errno.h>
#include <stdlib.h>

static int child(void *arg)
{
	char **argv = (char **) arg;
	argv++;
	execvp(argv[0], argv);
	_exit(1);
}

int main(int argc, char *argv[])
{
	char c = 0;
	char *stack = malloc(8192);
	const pid_t pid = clone(child, stack + (8192 / 2), CLONE_NEWNS,
				(void *) argv);
	while (waitpid(pid, NULL, __WALL) == EOF && errno == EINTR)
		c++; /* Dummy. */
	return 0;
}
---------- newns.c end ----------

Regards.
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
From: Tetsuo Handa @ 2011-03-29 4:30 UTC (permalink / raw)
To: rostedt; +Cc: peterz, mingo, linux-kernel

I made a small test module (which should work on 2.6.38.2).

---------- locktest.c start ----------
#include <linux/module.h>
#include <linux/seqlock.h>
#include <linux/lglock.h>
#include <linux/proc_fs.h>

static seqlock_t seqlock1;
static DEFINE_BRLOCK(brlock1);

static unsigned int read_seqbegin2(seqlock_t *sl)
{
	unsigned int ret;
#if 1
	unsigned long flags;
	spin_lock_irqsave(&sl->lock, flags);
	spin_unlock_irqrestore(&sl->lock, flags);
#endif
repeat:
	ret = sl->sequence;
	smp_rmb();
	if (unlikely(ret & 1)) {
		cpu_relax();
		goto repeat;
	}
	return ret;
}

static int locktest_open1(struct inode *inode, struct file *file)
{
	write_seqlock(&seqlock1);
	msleep(1000); /* Open /proc/locktest2 while sleeping here. */
	br_read_lock(brlock1);
	br_read_unlock(brlock1);
	write_sequnlock(&seqlock1);
	return -EINVAL;
}

static int locktest_open2(struct inode *inode, struct file *file)
{
	br_write_lock(brlock1);
	read_seqbegin2(&seqlock1);
	br_write_unlock(brlock1);
	return -EINVAL;
}

static int locktest_open3(struct inode *inode, struct file *file)
{
	static DEFINE_MUTEX(mutex1);
	mutex_lock(&mutex1);
	locktest_open1(inode, file);
	mutex_unlock(&mutex1);
	return -EINVAL;
}

static const struct file_operations locktest_operations1 = {
	.open = locktest_open1
};

static const struct file_operations locktest_operations2 = {
	.open = locktest_open2
};

static const struct file_operations locktest_operations3 = {
	.open = locktest_open3
};

static int __init locktest_init(void)
{
	struct proc_dir_entry *entry;
	seqlock_init(&seqlock1);
	entry = create_proc_entry("locktest1", 0666, NULL);
	if (entry)
		entry->proc_fops = &locktest_operations1;
	entry = create_proc_entry("locktest2", 0666, NULL);
	if (entry)
		entry->proc_fops = &locktest_operations2;
	entry = create_proc_entry("locktest3", 0666, NULL);
	if (entry)
		entry->proc_fops = &locktest_operations3;
	return 0;
}

static void __exit locktest_exit(void)
{
	remove_proc_entry("locktest1", NULL);
	remove_proc_entry("locktest2", NULL);
	remove_proc_entry("locktest3", NULL);
}

module_init(locktest_init);
module_exit(locktest_exit);
MODULE_LICENSE("GPL");
---------- locktest.c end ----------

---------- Makefile start ----------
obj-m += locktest.o
---------- Makefile end ----------

Below are my testcases and results (gcc 3.3 / x86_32).
Kernel config is at http://I-love.SAKURA.ne.jp/tmp/config-2.6.38.2 .

---------- Testcase 1 ----------

# cat /proc/locktest1
# cat /proc/locktest2
# cat /proc/locktest1
# cat /proc/locktest2
# cat /proc/locktest1
# cat /proc/locktest2

showed nothing in dmesg.

---------- Testcase 2 ----------

# cat /proc/locktest1
# cat /proc/locktest2
# cat /proc/locktest3

showed below message.
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.38.2 #1
-------------------------------------------------------
cat/2892 is trying to acquire lock:
 (brlock1_lock_dep_map){++++..}, at: [<e08440b0>] brlock1_local_lock+0x0/0x90 [locktest]

but task is already holding lock:
 (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e084451d>] locktest_open1+0xd/0x40 [locktest]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&(&(&seqlock1)->lock)->rlock){+.+...}:
       [<c106b62b>] check_prevs_add+0xab/0x100
       [<c106b9b5>] validate_chain+0x305/0x5a0
       [<c106dab4>] __lock_acquire+0x2a4/0x900
       [<c106f1ba>] lock_acquire+0x7a/0xa0
       [<c13a83e4>] _raw_spin_lock_irqsave+0x44/0x80
       [<e08444e9>] read_seqbegin2+0x19/0x40 [locktest]
       [<e0844562>] locktest_open2+0x12/0x20 [locktest]
       [<c1113f95>] proc_reg_open+0x65/0xe0
       [<c10caf2b>] __dentry_open+0xbb/0x2a0
       [<c10cb20e>] nameidata_to_filp+0x5e/0x70
       [<c10d8207>] finish_open+0x77/0xf0
       [<c10d86e9>] do_filp_open+0x1a9/0x5c0
       [<c10cb50c>] do_sys_open+0x5c/0xe0
       [<c10cb5b9>] sys_open+0x29/0x40
       [<c13a8f81>] syscall_call+0x7/0xb

-> #0 (brlock1_lock_dep_map){++++..}:
       [<c106b4e8>] check_prev_add+0x768/0x800
       [<c106b62b>] check_prevs_add+0xab/0x100
       [<c106b9b5>] validate_chain+0x305/0x5a0
       [<c106dab4>] __lock_acquire+0x2a4/0x900
       [<c106f1ba>] lock_acquire+0x7a/0xa0
       [<e08440e3>] brlock1_local_lock+0x33/0x90 [locktest]
       [<e0844532>] locktest_open1+0x22/0x40 [locktest]
       [<e0844596>] locktest_open3+0x26/0x50 [locktest]
       [<c1113f95>] proc_reg_open+0x65/0xe0
       [<c10caf2b>] __dentry_open+0xbb/0x2a0
       [<c10cb20e>] nameidata_to_filp+0x5e/0x70
       [<c10d8207>] finish_open+0x77/0xf0
       [<c10d86e9>] do_filp_open+0x1a9/0x5c0
       [<c10cb50c>] do_sys_open+0x5c/0xe0
       [<c10cb5b9>] sys_open+0x29/0x40
       [<c13a8f81>] syscall_call+0x7/0xb

other info that might help us debug this:

2 locks held by cat/2892:
 #0:  (mutex1){+.+...}, at: [<e084458d>] locktest_open3+0x1d/0x50 [locktest]
 #1:  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e084451d>] locktest_open1+0xd/0x40 [locktest]

stack backtrace:
Pid: 2892, comm: cat Not tainted 2.6.38.2 #1
Call Trace:
 [<c1069ff6>] ? print_circular_bug+0xc6/0xd0
 [<c106b4e8>] ? check_prev_add+0x768/0x800
 [<c106b62b>] ? check_prevs_add+0xab/0x100
 [<c106b9b5>] ? validate_chain+0x305/0x5a0
 [<c106dab4>] ? __lock_acquire+0x2a4/0x900
 [<c104a0ed>] ? destroy_timer_on_stack+0xd/0x10
 [<c13a6776>] ? schedule_timeout+0xf6/0x1b0
 [<c106f1ba>] ? lock_acquire+0x7a/0xa0
 [<e08440b0>] ? brlock1_local_lock+0x0/0x90 [locktest]
 [<e0844570>] ? locktest_open3+0x0/0x50 [locktest]
 [<e08440e3>] ? brlock1_local_lock+0x33/0x90 [locktest]
 [<e08440b0>] ? brlock1_local_lock+0x0/0x90 [locktest]
 [<c104af75>] ? msleep+0x15/0x20
 [<e0844532>] ? locktest_open1+0x22/0x40 [locktest]
 [<e0844596>] ? locktest_open3+0x26/0x50 [locktest]
 [<c1113f95>] ? proc_reg_open+0x65/0xe0
 [<c10caf2b>] ? __dentry_open+0xbb/0x2a0
 [<c10cb20e>] ? nameidata_to_filp+0x5e/0x70
 [<c1113f30>] ? proc_reg_open+0x0/0xe0
 [<c10d8207>] ? finish_open+0x77/0xf0
 [<c10d86e9>] ? do_filp_open+0x1a9/0x5c0
 [<c106c97b>] ? trace_hardirqs_off+0xb/0x10
 [<c13a853d>] ? _raw_spin_unlock+0x1d/0x20
 [<c10e5d81>] ? alloc_fd+0xe1/0x1a0
 [<c10cb50c>] ? do_sys_open+0x5c/0xe0
 [<c10cb5b9>] ? sys_open+0x29/0x40
 [<c13a8f81>] ? syscall_call+0x7/0xb

---------- Testcase 3 ----------

# cat /proc/locktest2
# cat /proc/locktest1

showed below message.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.38.2 #1
-------------------------------------------------------
cat/2891 is trying to acquire lock:
 (brlock1_lock_dep_map){++++..}, at: [<e08440b0>] brlock1_local_lock+0x0/0x90 [locktest]

but task is already holding lock:
 (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e084451d>] locktest_open1+0xd/0x40 [locktest]

which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:

-> #1 (&(&(&seqlock1)->lock)->rlock){+.+...}:
       [<c106b62b>] check_prevs_add+0xab/0x100
       [<c106b9b5>] validate_chain+0x305/0x5a0
       [<c106dab4>] __lock_acquire+0x2a4/0x900
       [<c106f1ba>] lock_acquire+0x7a/0xa0
       [<c13a83e4>] _raw_spin_lock_irqsave+0x44/0x80
       [<e08444e9>] read_seqbegin2+0x19/0x40 [locktest]
       [<e0844562>] locktest_open2+0x12/0x20 [locktest]
       [<c1113f95>] proc_reg_open+0x65/0xe0
       [<c10caf2b>] __dentry_open+0xbb/0x2a0
       [<c10cb20e>] nameidata_to_filp+0x5e/0x70
       [<c10d8207>] finish_open+0x77/0xf0
       [<c10d86e9>] do_filp_open+0x1a9/0x5c0
       [<c10cb50c>] do_sys_open+0x5c/0xe0
       [<c10cb5b9>] sys_open+0x29/0x40
       [<c13a8f81>] syscall_call+0x7/0xb

-> #0 (brlock1_lock_dep_map){++++..}:
       [<c106b4e8>] check_prev_add+0x768/0x800
       [<c106b62b>] check_prevs_add+0xab/0x100
       [<c106b9b5>] validate_chain+0x305/0x5a0
       [<c106dab4>] __lock_acquire+0x2a4/0x900
       [<c106f1ba>] lock_acquire+0x7a/0xa0
       [<e08440e3>] brlock1_local_lock+0x33/0x90 [locktest]
       [<e0844532>] locktest_open1+0x22/0x40 [locktest]
       [<c1113f95>] proc_reg_open+0x65/0xe0
       [<c10caf2b>] __dentry_open+0xbb/0x2a0
       [<c10cb20e>] nameidata_to_filp+0x5e/0x70
       [<c10d8207>] finish_open+0x77/0xf0
       [<c10d86e9>] do_filp_open+0x1a9/0x5c0
       [<c10cb50c>] do_sys_open+0x5c/0xe0
       [<c10cb5b9>] sys_open+0x29/0x40
       [<c13a8f81>] syscall_call+0x7/0xb

other info that might help us debug this:

1 lock held by cat/2891:
 #0:  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e084451d>] locktest_open1+0xd/0x40 [locktest]

stack backtrace:
Pid: 2891, comm: cat Not tainted 2.6.38.2 #1
Call Trace:
 [<c1069ff6>] ? print_circular_bug+0xc6/0xd0
 [<c106b4e8>] ? check_prev_add+0x768/0x800
 [<c100590e>] ? dump_trace+0x5e/0xd0
 [<c106b62b>] ? check_prevs_add+0xab/0x100
 [<c106b9b5>] ? validate_chain+0x305/0x5a0
 [<c106d10c>] ? mark_lock+0x21c/0x3c0
 [<c106dab4>] ? __lock_acquire+0x2a4/0x900
 [<c104a0ed>] ? destroy_timer_on_stack+0xd/0x10
 [<c13a6776>] ? schedule_timeout+0xf6/0x1b0
 [<c106f1ba>] ? lock_acquire+0x7a/0xa0
 [<e08440b0>] ? brlock1_local_lock+0x0/0x90 [locktest]
 [<e0844510>] ? locktest_open1+0x0/0x40 [locktest]
 [<e08440e3>] ? brlock1_local_lock+0x33/0x90 [locktest]
 [<e08440b0>] ? brlock1_local_lock+0x0/0x90 [locktest]
 [<c104af75>] ? msleep+0x15/0x20
 [<e0844532>] ? locktest_open1+0x22/0x40 [locktest]
 [<c1113f95>] ? proc_reg_open+0x65/0xe0
 [<c10caf2b>] ? __dentry_open+0xbb/0x2a0
 [<c10cb20e>] ? nameidata_to_filp+0x5e/0x70
 [<c1113f30>] ? proc_reg_open+0x0/0xe0
 [<c10d8207>] ? finish_open+0x77/0xf0
 [<c10d86e9>] ? do_filp_open+0x1a9/0x5c0
 [<c106c97b>] ? trace_hardirqs_off+0xb/0x10
 [<c13a853d>] ? _raw_spin_unlock+0x1d/0x20
 [<c10e5d81>] ? alloc_fd+0xe1/0x1a0
 [<c10cb50c>] ? do_sys_open+0x5c/0xe0
 [<c10cb5b9>] ? sys_open+0x29/0x40
 [<c13a8f81>] ? syscall_call+0x7/0xb

---------- Testcase 4 ----------

Starting

# cat /proc/locktest1

from one terminal and starting

# cat /proc/locktest2

(before "cat /proc/locktest1" finishes) from another terminal showed below
message.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.38.2 #1
-------------------------------------------------------
cat/2893 is trying to acquire lock:
 (brlock1_lock_dep_map){++++..}, at: [<e08440b0>] brlock1_local_lock+0x0/0x90 [locktest]

but task is already holding lock:
 (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e084451d>] locktest_open1+0xd/0x40 [locktest]

which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:

-> #1 (&(&(&seqlock1)->lock)->rlock){+.+...}:
       [<c106b62b>] check_prevs_add+0xab/0x100
       [<c106b9b5>] validate_chain+0x305/0x5a0
       [<c106dab4>] __lock_acquire+0x2a4/0x900
       [<c106f1ba>] lock_acquire+0x7a/0xa0
       [<c13a83e4>] _raw_spin_lock_irqsave+0x44/0x80
       [<e08444e9>] read_seqbegin2+0x19/0x40 [locktest]
       [<e0844562>] locktest_open2+0x12/0x20 [locktest]
       [<c1113f95>] proc_reg_open+0x65/0xe0
       [<c10caf2b>] __dentry_open+0xbb/0x2a0
       [<c10cb20e>] nameidata_to_filp+0x5e/0x70
       [<c10d8207>] finish_open+0x77/0xf0
       [<c10d86e9>] do_filp_open+0x1a9/0x5c0
       [<c10cb50c>] do_sys_open+0x5c/0xe0
       [<c10cb5b9>] sys_open+0x29/0x40
       [<c13a8f81>] syscall_call+0x7/0xb

-> #0 (brlock1_lock_dep_map){++++..}:
       [<c106b4e8>] check_prev_add+0x768/0x800
       [<c106b62b>] check_prevs_add+0xab/0x100
       [<c106b9b5>] validate_chain+0x305/0x5a0
       [<c106dab4>] __lock_acquire+0x2a4/0x900
       [<c106f1ba>] lock_acquire+0x7a/0xa0
       [<e08440e3>] brlock1_local_lock+0x33/0x90 [locktest]
       [<e0844532>] locktest_open1+0x22/0x40 [locktest]
       [<c1113f95>] proc_reg_open+0x65/0xe0
       [<c10caf2b>] __dentry_open+0xbb/0x2a0
       [<c10cb20e>] nameidata_to_filp+0x5e/0x70
       [<c10d8207>] finish_open+0x77/0xf0
       [<c10d86e9>] do_filp_open+0x1a9/0x5c0
       [<c10cb50c>] do_sys_open+0x5c/0xe0
       [<c10cb5b9>] sys_open+0x29/0x40
       [<c13a8f81>] syscall_call+0x7/0xb

other info that might help us debug this:

1 lock held by cat/2893:
 #0:  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e084451d>] locktest_open1+0xd/0x40 [locktest]

stack backtrace:
Pid: 2893, comm: cat Not tainted 2.6.38.2 #1
Call Trace:
 [<c1069ff6>] ? print_circular_bug+0xc6/0xd0
 [<c106b4e8>] ? check_prev_add+0x768/0x800
 [<c100590e>] ? dump_trace+0x5e/0xd0
 [<c106b62b>] ? check_prevs_add+0xab/0x100
 [<c106b9b5>] ? validate_chain+0x305/0x5a0
 [<c106d10c>] ? mark_lock+0x21c/0x3c0
 [<c106dab4>] ? __lock_acquire+0x2a4/0x900
 [<c104a0ed>] ? destroy_timer_on_stack+0xd/0x10
 [<c13a6776>] ? schedule_timeout+0xf6/0x1b0
 [<c106f1ba>] ? lock_acquire+0x7a/0xa0
 [<e08440b0>] ? brlock1_local_lock+0x0/0x90 [locktest]
 [<e0844510>] ? locktest_open1+0x0/0x40 [locktest]
 [<e08440e3>] ? brlock1_local_lock+0x33/0x90 [locktest]
 [<e08440b0>] ? brlock1_local_lock+0x0/0x90 [locktest]
 [<c104af75>] ? msleep+0x15/0x20
 [<e0844532>] ? locktest_open1+0x22/0x40 [locktest]
 [<c1113f95>] ? proc_reg_open+0x65/0xe0
 [<c10caf2b>] ? __dentry_open+0xbb/0x2a0
 [<c10cb20e>] ? nameidata_to_filp+0x5e/0x70
 [<c1113f30>] ? proc_reg_open+0x0/0xe0
 [<c10d8207>] ? finish_open+0x77/0xf0
 [<c10d86e9>] ? do_filp_open+0x1a9/0x5c0
 [<c106c97b>] ? trace_hardirqs_off+0xb/0x10
 [<c13a853d>] ? _raw_spin_unlock+0x1d/0x20
 [<c10e5d81>] ? alloc_fd+0xe1/0x1a0
 [<c10cb50c>] ? do_sys_open+0x5c/0xe0
 [<c10cb5b9>] ? sys_open+0x29/0x40
 [<c13a8f81>] ? syscall_call+0x7/0xb
BUG: spinlock lockup on CPU#1, cat/2894, e0844a44
Pid: 2894, comm: cat Not tainted 2.6.38.2 #1
Call Trace:
 [<c11e7d8a>] ? __spin_lock_debug+0xca/0xf0
 [<c11e7e19>] ? do_raw_spin_lock+0x69/0xa0
 [<c13a8419>] ? _raw_spin_lock_irqsave+0x79/0x80
 [<e08444e9>] ? read_seqbegin2+0x19/0x40 [locktest]
 [<e0844550>] ? locktest_open2+0x0/0x20 [locktest]
 [<e08444e9>] ? read_seqbegin2+0x19/0x40 [locktest]
 [<e0844562>] ? locktest_open2+0x12/0x20 [locktest]
 [<c1113f95>] ? proc_reg_open+0x65/0xe0
 [<c10caf2b>] ? __dentry_open+0xbb/0x2a0
 [<c10cb20e>] ? nameidata_to_filp+0x5e/0x70
 [<c1113f30>] ? proc_reg_open+0x0/0xe0
 [<c10d8207>] ? finish_open+0x77/0xf0
 [<c10d86e9>] ? do_filp_open+0x1a9/0x5c0
 [<c106c97b>] ? trace_hardirqs_off+0xb/0x10
 [<c13a853d>] ? _raw_spin_unlock+0x1d/0x20
 [<c10e5d81>] ? alloc_fd+0xe1/0x1a0
 [<c10cb50c>] ? do_sys_open+0x5c/0xe0
 [<c10cb5b9>] ? sys_open+0x29/0x40
 [<c13a8f81>] ? syscall_call+0x7/0xb
sending NMI to all CPUs:
(...snipped...)

---------- Testcase 5 ----------

Same testcase as 4, but a different result.

BUG: spinlock cpu recursion on CPU#0, cat/2894
 lock: e0844a44, .magic: dead4ead, .owner: cat/2893, .owner_cpu: 0
Pid: 2894, comm: cat Not tainted 2.6.38.2 #1
Call Trace:
 [<c11e7cad>] ? spin_bug+0xad/0xc0
 [<c11e7e27>] ? do_raw_spin_lock+0x77/0xa0
 [<c13a8419>] ? _raw_spin_lock_irqsave+0x79/0x80
 [<e08444e9>] ? read_seqbegin2+0x19/0x40 [locktest]
 [<e0844550>] ? locktest_open2+0x0/0x20 [locktest]
 [<e08444e9>] ? read_seqbegin2+0x19/0x40 [locktest]
 [<e0844562>] ? locktest_open2+0x12/0x20 [locktest]
 [<c1113f95>] ? proc_reg_open+0x65/0xe0
 [<c10caf2b>] ? __dentry_open+0xbb/0x2a0
 [<c10cb20e>] ? nameidata_to_filp+0x5e/0x70
 [<c1113f30>] ? proc_reg_open+0x0/0xe0
 [<c10d8207>] ? finish_open+0x77/0xf0
 [<c10d86e9>] ? do_filp_open+0x1a9/0x5c0
 [<c106c97b>] ? trace_hardirqs_off+0xb/0x10
 [<c13a853d>] ? _raw_spin_unlock+0x1d/0x20
 [<c10e5d81>] ? alloc_fd+0xe1/0x1a0
 [<c10cb50c>] ? do_sys_open+0x5c/0xe0
 [<c10cb5b9>] ? sys_open+0x29/0x40
 [<c13a8f81>] ? syscall_call+0x7/0xb

---------- Testcase 6 ----------

Same testcase as 4, but a different result.

BUG: spinlock lockup on CPU#0, cat/2894, e0844a44
Pid: 2894, comm: cat Not tainted 2.6.38.2 #1
Call Trace:
 [<c11e7d8a>] ? __spin_lock_debug+0xca/0xf0
 [<c11e7e19>] ? do_raw_spin_lock+0x69/0xa0
 [<c13a8419>] ? _raw_spin_lock_irqsave+0x79/0x80
 [<e08444e9>] ? read_seqbegin2+0x19/0x40 [locktest]
 [<e0844550>] ? locktest_open2+0x0/0x20 [locktest]
 [<e08444e9>] ? read_seqbegin2+0x19/0x40 [locktest]
 [<e0844562>] ? locktest_open2+0x12/0x20 [locktest]
 [<c1113f95>] ? proc_reg_open+0x65/0xe0
 [<c10caf2b>] ? __dentry_open+0xbb/0x2a0
 [<c10cb20e>] ? nameidata_to_filp+0x5e/0x70
 [<c1113f30>] ? proc_reg_open+0x0/0xe0
 [<c10d8207>] ? finish_open+0x77/0xf0
 [<c10d86e9>] ? do_filp_open+0x1a9/0x5c0
 [<c106c97b>] ? trace_hardirqs_off+0xb/0x10
 [<c13a853d>] ? _raw_spin_unlock+0x1d/0x20
 [<c10e5d81>] ? alloc_fd+0xe1/0x1a0
 [<c10cb50c>] ? do_sys_open+0x5c/0xe0
 [<c10cb5b9>] ? sys_open+0x29/0x40
 [<c13a8f81>] ? syscall_call+0x7/0xb
sending NMI to all CPUs:
NMI backtrace for cpu 0
(...snipped...)

Well... did I misuse spinlock primitives in read_seqbegin2()?
Anyway, lockdep should catch testcase 1.

Regards.
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
From: Steven Rostedt @ 2011-03-29 12:49 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: peterz, mingo, linux-kernel

On Tue, 2011-03-29 at 13:30 +0900, Tetsuo Handa wrote:
> I made a small test module (which should work on 2.6.38.2).

Thanks! I'll take a look at this later today.

> ---------- locktest.c start ----------

[...]

> ---------- Makefile end ----------
>
> Below are my testcases and results (gcc 3.3 / x86_32).
> Kernel config is at http://I-love.SAKURA.ne.jp/tmp/config-2.6.38.2 .
>
> ---------- Testcase 1 ----------
>
> # cat /proc/locktest1
> # cat /proc/locktest2
> # cat /proc/locktest1
> # cat /proc/locktest2
> # cat /proc/locktest1
> # cat /proc/locktest2
>
> showed nothing in dmesg.
>
> ---------- Testcase 2 ----------
>
> # cat /proc/locktest1
> # cat /proc/locktest2
> # cat /proc/locktest3
>
> showed below message.
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.38.2 #1
> -------------------------------------------------------

[...]

> Well... did I misuse spinlock primitives in read_seqbegin2()?
> Anyway, lockdep should catch testcase 1.

Strange. I'll investigate this further.

Thanks,

-- Steve
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
From: Peter Zijlstra @ 2011-03-29 13:39 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: rostedt, mingo, linux-kernel

On Tue, 2011-03-29 at 13:30 +0900, Tetsuo Handa wrote:
> static int locktest_open1(struct inode *inode, struct file *file)
> {
> 	write_seqlock(&seqlock1);
> 	msleep(1000); /* Open /proc/locktest2 while sleeping here. */
> 	br_read_lock(brlock1);
> 	br_read_unlock(brlock1);
> 	write_sequnlock(&seqlock1);
> 	return -EINVAL;
> }
>
> static int locktest_open2(struct inode *inode, struct file *file)
> {
> 	br_write_lock(brlock1);
> 	read_seqbegin2(&seqlock1);
> 	br_write_unlock(brlock1);
> 	return -EINVAL;
> }
>
> static int locktest_open3(struct inode *inode, struct file *file)
> {
> 	static DEFINE_MUTEX(mutex1);
> 	mutex_lock(&mutex1);
> 	locktest_open1(inode, file);
> 	mutex_unlock(&mutex1);
> 	return -EINVAL;
> }

That's quite horrid as far as test-cases go, why bother with the
userspace part at all?

In order to hit your inversion you need to do something like:

	cat /proc/locktest1 & cat /proc/locktest2

if you do them serialized you'll never hit that inversion.

That said, there are some outstanding issues with rw_locks and lockdep;
Gautham and I worked on that for a while but we never persevered and
finished it:

	http://lkml.org/lkml/2009/5/11/203

And I think you're hitting that case.
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
From: Peter Zijlstra @ 2011-03-29 17:50 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: rostedt, mingo, linux-kernel

On Tue, 2011-03-29 at 15:39 +0200, Peter Zijlstra wrote:
>
> That said, there are some out-standing issues with rw_locks and lockdep,
> Gautham and I worked on that for a while but we never persevered and
> finished it..
>
> http://lkml.org/lkml/2009/5/11/203

I just did a quick rebase onto tip/master (compile tested only):

	http://programming.kicks-ass.net/sekrit/patches-lockdep.tar.bz2

That series needs testing and a few patches to extend the
lib/locking-selftest* bits to cover the new functionality.
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
  2011-03-29 17:50 ` Peter Zijlstra
@ 2011-03-30  8:12 ` Tetsuo Handa
  2011-03-30  9:50 ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Tetsuo Handa @ 2011-03-30 8:12 UTC (permalink / raw)
To: peterz; +Cc: rostedt, mingo, linux-kernel

Peter Zijlstra wrote:
> On Tue, 2011-03-29 at 15:39 +0200, Peter Zijlstra wrote:
> >
> > That said, there are some out-standing issues with rw_locks and lockdep,
> > Gautham and I worked on that for a while but we never persevered and
> > finished it..
> >
> > http://lkml.org/lkml/2009/5/11/203
>
> I just did a quick rebase onto tip/master (compile tested only):
>
> http://programming.kicks-ass.net/sekrit/patches-lockdep.tar.bz2
>
> That series needs testing and a few patches to extend the
> lib/locking-selftest* bits to cover the new functionality.

Thanks, but I didn't apply the above tarball to 2.6.38.2 because the
lockdep selftests failed. Instead, I retested using 2.6.39-rc1 without
applying the tarball.

> In order to hit your inversion you need to do something like:
>
> cat /proc/locktest1 & cat /proc/locktest2
>
> if you do them serialized you'll never hit that inversion.

Yes, I know. But I think that lockdep should report the possibility of
hitting that inversion even if I do them serialized. My understanding is
that lockdep can report the possibility of deadlock without actually
triggering it, as long as the lockdep annotations are maintained correctly
and each annotation is called at least once. Am I misunderstanding?

I'm exploring this code in order to let lockdep report the possibility of
hitting the inversion without actually hitting it.
> On Sat, 2011-03-26 at 13:12 +0900, Tetsuo Handa wrote:
> > @@ -86,6 +86,11 @@ static inline int write_tryseqlock(seqlo
> >  static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
> >  {
> >  	unsigned ret;
> > +#ifdef CONFIG_PROVE_LOCKING
> > +	unsigned long flags;
> > +	spin_lock_irqsave(&((seqlock_t *) sl)->lock, flags);
> > +	spin_unlock_irqrestore(&((seqlock_t *) sl)->lock, flags);
> > +#endif
> >
> >  repeat:
> >  	ret = sl->sequence;
>
> That isn't the right way, something like the below would do, however
> there's a reason this isn't done, we use these primitives from the VDSO
> which means they're not permitted to write to any kernel memory at all.

So, this is not a bug but intentional coding. Then, we want a comment there
explaining why the lockdep annotation is missing.

> --- a/include/linux/seqlock.h
> +++ b/include/linux/seqlock.h
> @@ -94,6 +94,7 @@ static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
>  		cpu_relax();
>  		goto repeat;
>  	}
> +	rwlock_acquire_read(&sl->lock.dep_map, 0, 0, _RET_IP_);
>
>  	return ret;
>  }
> @@ -107,6 +108,8 @@ static __always_inline int read_seqretry(const seqlock_t *sl, unsigned start)
>  {
>  	smp_rmb();
>
> +	rwlock_release(&sl->lock.dep_map, 1, _RET_IP_);
> +
>  	return unlikely(sl->sequence != start);
>  }

Excuse me, but the lock embedded into seqlock_t is a spinlock rather than
an rwlock. I assume you meant spin_acquire()/spin_release() rather than
rwlock_acquire_read()/rwlock_release(). Also, I assume you meant to call
spin_acquire() before entering the spin state (as with

static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
        preempt_disable();
        spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
        LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
}

; otherwise lockdep cannot report this bug upon the first call to the
function). In order to avoid writing to kernel memory from VDSO code, I
copied the original function and added the lockdep annotation.
---------- locktest.c (V2) start ----------
#include <linux/module.h>
#include <linux/seqlock.h>
#include <linux/lglock.h>
#include <linux/proc_fs.h>
#include <linux/delay.h>

static seqlock_t seqlock1;
static DEFINE_BRLOCK(brlock1);

static __always_inline unsigned read_seqbegin2(seqlock_t *sl)
{
        unsigned ret;
        spin_acquire(&sl->lock.dep_map, 0, 0, _RET_IP_);
repeat:
        ret = sl->sequence;
        smp_rmb();
        if (unlikely(ret & 1)) {
                cpu_relax();
                goto repeat;
        }
        return ret;
}

static __always_inline int read_seqretry2(seqlock_t *sl, unsigned start)
{
        smp_rmb();
        spin_release(&sl->lock.dep_map, 1, _RET_IP_);
        return unlikely(sl->sequence != start);
}

static int locktest_open1(struct inode *inode, struct file *file)
{
        write_seqlock(&seqlock1);
        msleep(1000); /* Open /proc/locktest2 while sleeping here. */
        br_read_lock(brlock1);
        br_read_unlock(brlock1);
        write_sequnlock(&seqlock1);
        return -EINVAL;
}

static int locktest_open2(struct inode *inode, struct file *file)
{
        unsigned int seq;
        br_write_lock(brlock1);
        seq = read_seqbegin2(&seqlock1);
        read_seqretry2(&seqlock1, seq);
        br_write_unlock(brlock1);
        return -EINVAL;
}

static int locktest_open3(struct inode *inode, struct file *file)
{
        static DEFINE_MUTEX(mutex1);
        mutex_lock(&mutex1);
        locktest_open1(inode, file);
        mutex_unlock(&mutex1);
        return -EINVAL;
}

static const struct file_operations locktest_operations1 = {
        .open = locktest_open1
};

static const struct file_operations locktest_operations2 = {
        .open = locktest_open2
};

static const struct file_operations locktest_operations3 = {
        .open = locktest_open3
};

static int __init locktest_init(void)
{
        struct proc_dir_entry *entry;
        seqlock_init(&seqlock1);
        entry = create_proc_entry("locktest1", 0666, NULL);
        if (entry)
                entry->proc_fops = &locktest_operations1;
        entry = create_proc_entry("locktest2", 0666, NULL);
        if (entry)
                entry->proc_fops = &locktest_operations2;
        entry = create_proc_entry("locktest3", 0666, NULL);
        if (entry)
                entry->proc_fops = &locktest_operations3;
        return 0;
}

static void __exit locktest_exit(void)
{
        remove_proc_entry("locktest1", NULL);
        remove_proc_entry("locktest2", NULL);
        remove_proc_entry("locktest3", NULL);
}

module_init(locktest_init);
module_exit(locktest_exit);
MODULE_LICENSE("GPL");
---------- locktest.c (V2) end ----------

(... well, I wonder whether calling spin_release() from read_seqretry2()
makes sense, because it is legal to call it like

        while (1) {
                seq = read_seqbegin();
                do_something();
                if (read_seqretry(seq))
                        continue;
                do_something_else();
                if (!read_seqretry(seq))
                        break;
        }

whereas that is illegal for read_unlock()...)

But the result was the same. "cat /proc/locktest1 /proc/locktest2" showed
nothing in dmesg, whereas "cat /proc/locktest2 /proc/locktest1" showed the
message below.

[  221.461750]
[  221.461752] =======================================================
[  221.464593] [ INFO: possible circular locking dependency detected ]
[  221.465538] 2.6.39-rc1-ccs #1
[  221.465538] -------------------------------------------------------
[  221.465538] cat/3313 is trying to acquire lock:
[  221.465538]  (brlock1_lock_dep_map){++++..}, at: [<e08540b0>] brlock1_local_lock+0x0/0x90 [locktest]
[  221.465538]
[  221.465538] but task is already holding lock:
[  221.465538]  (&(&(&seqlock1)->lock)->rlock){+.+...}, at: [<e08544dd>] locktest_open1+0xd/0x40 [locktest]
[  221.465538]
[  221.465538] which lock already depends on the new lock.
[  221.465538]
[  221.465538]
[  221.465538] the existing dependency chain (in reverse order) is:
[  221.465538]
[  221.465538] -> #1 (&(&(&seqlock1)->lock)->rlock){+.+...}:
[  221.465538]        [<c106c58b>] check_prevs_add+0xab/0x100
[  221.465538]        [<c106c915>] validate_chain+0x305/0x5a0
[  221.465538]        [<c106ea14>] __lock_acquire+0x2a4/0x900
[  221.465538]        [<c107011a>] lock_acquire+0x7a/0xa0
[  221.465538]        [<e0854546>] locktest_open2+0x36/0x70 [locktest]
[  221.465538]        [<c1118585>] proc_reg_open+0x65/0xe0
[  221.465538]        [<c10ce90f>] __dentry_open+0x16f/0x2e0
[  221.465538]        [<c10ceb7e>] nameidata_to_filp+0x5e/0x70
[  221.465538]        [<c10dc358>] do_last+0xf8/0x6c0
[  221.465538]        [<c10dc9c6>] path_openat+0xa6/0x340
[  221.465538]        [<c10dccd4>] do_filp_open+0x74/0x80
[  221.465538]        [<c10cf121>] do_sys_open+0x101/0x1a0
[  221.465538]        [<c10cf1e9>] sys_open+0x29/0x40
[  221.465538]        [<c13b4951>] syscall_call+0x7/0xb
[  221.465538]
[  221.465538] -> #0 (brlock1_lock_dep_map){++++..}:
[  221.465538]        [<c106c448>] check_prev_add+0x768/0x800
[  221.465538]        [<c106c58b>] check_prevs_add+0xab/0x100
[  221.465538]        [<c106c915>] validate_chain+0x305/0x5a0
[  221.465538]        [<c106ea14>] __lock_acquire+0x2a4/0x900
[  221.465538]        [<c107011a>] lock_acquire+0x7a/0xa0
[  221.465538]        [<e08540e3>] brlock1_local_lock+0x33/0x90 [locktest]
[  221.465538]        [<e08544f2>] locktest_open1+0x22/0x40 [locktest]
[  221.465538]        [<c1118585>] proc_reg_open+0x65/0xe0
[  221.465538]        [<c10ce90f>] __dentry_open+0x16f/0x2e0
[  221.465538]        [<c10ceb7e>] nameidata_to_filp+0x5e/0x70
[  221.465538]        [<c10dc358>] do_last+0xf8/0x6c0
[  221.465538]        [<c10dc9c6>] path_openat+0xa6/0x340
[  221.465538]        [<c10dccd4>] do_filp_open+0x74/0x80
[  221.465538]        [<c10cf121>] do_sys_open+0x101/0x1a0
[  221.465538]        [<c10cf1e9>] sys_open+0x29/0x40
[  221.465538]        [<c13b4951>] syscall_call+0x7/0xb
[  221.465538]
[  221.465538] other info that might help us debug this:
[  221.465538]
[  221.465538] 1 lock held by cat/3313:
[  221.465538]  #0:  (&(&(&seqlock1)->lock)->rlock){+.+...}, at:
[<e08544dd>] locktest_open1+0xd/0x40 [locktest]
[  221.465538]
[  221.465538] stack backtrace:
[  221.465538] Pid: 3313, comm: cat Not tainted 2.6.39-rc1-ccs #1
[  221.465538] Call Trace:
[  221.465538]  [<c106af56>] print_circular_bug+0xc6/0xd0
[  221.465538]  [<c106c448>] check_prev_add+0x768/0x800
[  221.465538]  [<c107d44b>] ? __module_text_address+0xb/0x50
[  221.465538]  [<c1005d3b>] ? print_context_stack+0x3b/0xa0
[  221.465538]  [<c106c58b>] check_prevs_add+0xab/0x100
[  221.465538]  [<c106c915>] validate_chain+0x305/0x5a0
[  221.465538]  [<c106e06c>] ? mark_lock+0x21c/0x3c0
[  221.465538]  [<c106ea14>] __lock_acquire+0x2a4/0x900
[  221.465538]  [<c1049fed>] ? destroy_timer_on_stack+0xd/0x10
[  221.465538]  [<c13b22b6>] ? schedule_timeout+0xf6/0x1b0
[  221.465538]  [<c107011a>] lock_acquire+0x7a/0xa0
[  221.465538]  [<e08540b0>] ? brlock1_lock_init+0xb0/0xb0 [locktest]
[  221.465538]  [<e08544d0>] ? brlock1_global_unlock+0xa0/0xa0 [locktest]
[  221.465538]  [<e08540e3>] brlock1_local_lock+0x33/0x90 [locktest]
[  221.465538]  [<e08540b0>] ? brlock1_lock_init+0xb0/0xb0 [locktest]
[  221.465538]  [<c104ae45>] ? msleep+0x15/0x20
[  221.465538]  [<e08544f2>] locktest_open1+0x22/0x40 [locktest]
[  221.465538]  [<c1118585>] proc_reg_open+0x65/0xe0
[  221.465538]  [<c10ce90f>] __dentry_open+0x16f/0x2e0
[  221.465538]  [<c10ceb7e>] nameidata_to_filp+0x5e/0x70
[  221.465538]  [<c1118520>] ? proc_reg_mmap+0x80/0x80
[  221.465538]  [<c10dc358>] do_last+0xf8/0x6c0
[  221.465538]  [<c10dc9c6>] path_openat+0xa6/0x340
[  221.465538]  [<c106d8db>] ? trace_hardirqs_off+0xb/0x10
[  221.465538]  [<c10dccd4>] do_filp_open+0x74/0x80
[  221.465538]  [<c13b3fed>] ? _raw_spin_unlock+0x1d/0x20
[  221.465538]  [<c10ea091>] ? alloc_fd+0xe1/0x1a0
[  221.465538]  [<c10cf121>] do_sys_open+0x101/0x1a0
[  221.465538]  [<c10cfe2b>] ?
vfs_write+0x10b/0x130
[  221.465538]  [<c10cf1e9>] sys_open+0x29/0x40
[  221.465538]  [<c13b4951>] syscall_call+0x7/0xb

If lockdep is doing a full analysis on every addition, why can't lockdep
catch the "cat /proc/locktest1 /proc/locktest2" case? (I think the answer
is either "the annotation I added to read_seqbegin2()/read_seqretry2() is
wrong" or "lockdep cannot do a full analysis in some cases".)

Regards.
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
  2011-03-30  8:12 ` Tetsuo Handa
@ 2011-03-30  9:50 ` Peter Zijlstra
  2011-03-30 12:17 ` Tetsuo Handa
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2011-03-30 9:50 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: rostedt, mingo, linux-kernel

On Wed, 2011-03-30 at 17:12 +0900, Tetsuo Handa wrote:
> Peter Zijlstra wrote:
> > On Tue, 2011-03-29 at 15:39 +0200, Peter Zijlstra wrote:
> > >
> > > That said, there are some out-standing issues with rw_locks and lockdep,
> > > Gautham and I worked on that for a while but we never persevered and
> > > finished it..
> > >
> > > http://lkml.org/lkml/2009/5/11/203
> >
> > I just did a quick rebase onto tip/master (compile tested only):
> >
> > http://programming.kicks-ass.net/sekrit/patches-lockdep.tar.bz2
> >
> > That series needs testing and a few patches to extend the
> > lib/locking-selftest* bits to cover the new functionality.
>
> Thanks, but I didn't apply above tarball to 2.6.38.2 because lockdep selftests
> failed.

I probably messed up the last patch, it's basically a complete rewrite
because lockdep changed significantly between when that series was
written and now.

> > In order to hit your inversion you need to do something like:
> >
> > cat /proc/locktest1 & cat /proc/locktest2
> >
> > if you do them serialized you'll never hit that inversion.
>
> Yes, I know. But I think that lockdep should report the possibility of hitting
> that inversion even if I do them serialized.

True, my bad.

> So, this is not a bug but intended coding. Then, we want a comment here why
> lockdep annotation is missing.

Nah, ideally we'd fix it by making the VDSO code use another primitive.
> > --- a/include/linux/seqlock.h
> > +++ b/include/linux/seqlock.h
> > @@ -94,6 +94,7 @@ static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
> >  		cpu_relax();
> >  		goto repeat;
> >  	}
> > +	rwlock_acquire_read(&sl->lock.dep_map, 0, 0, _RET_IP_);
> >
> >  	return ret;
> >  }
> > @@ -107,6 +108,8 @@ static __always_inline int read_seqretry(const seqlock_t *sl, unsigned start)
> >  {
> >  	smp_rmb();
> >
> > +	rwlock_release(&sl->lock.dep_map, 1, _RET_IP_);
> > +
> >  	return unlikely(sl->sequence != start);
> >  }
>
> Excuse me, but the lock embedded into seqlock_t is spinlock rather than rwlock.
> I assume you meant spin_acquire()/spin_release() rather than
> rwlock_acquire_read()/rwlock_release().

No, I meant what I wrote ;-) it doesn't matter to lockdep that it's a
spinlock (lockdep doesn't even know that), and in fact rwlock_acquire (the
write version) is identical to spin_acquire(): both acquire the lock in
the exclusive state.

The read side of seqlocks is a recursive read lock, hence
rwlock_acquire_read().

> Also, I assume you meant to call
> spin_acquire() before entering the spin state (as with
>
> static inline void __raw_spin_lock(raw_spinlock_t *lock)
> {
> 	preempt_disable();
> 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
> 	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
> }
>
> . Otherwise, lockdep cannot report it when hit this bug upon the first call to
> this function).

Huh no, of course not, a seqlock read side cannot contend in the classic
sense.
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
  2011-03-30  9:50 ` Peter Zijlstra
@ 2011-03-30 12:17 ` Tetsuo Handa
  2011-03-31 13:59 ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Tetsuo Handa @ 2011-03-30 12:17 UTC (permalink / raw)
To: peterz; +Cc: rostedt, mingo, linux-kernel

Peter Zijlstra wrote:
> > Also, I assume you meant to call
> > spin_acquire() before entering the spin state (as with
> >
> > static inline void __raw_spin_lock(raw_spinlock_t *lock)
> > {
> > 	preempt_disable();
> > 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
> > 	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
> > }
> >
> > . Otherwise, lockdep cannot report it when hit this bug upon the first call to
> > this function).
>
> Huh no, of course not, a seqlock read side cannot contend in the classic
> sense.

I couldn't understand what 'contend' means here. I think

static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
{
        unsigned ret;
repeat:
        ret = sl->sequence;
        smp_rmb();
        if (unlikely(ret & 1)) {
                cpu_relax();
                goto repeat;
        }
        return ret;
}

is equivalent (except that the above one will not write to any kernel
memory) to

static __always_inline unsigned read_seqbegin(seqlock_t *sl)
{
        unsigned ret;
        unsigned long flags;
        spin_lock_irqsave(&sl->lock, flags);
        ret = sl->sequence;
        spin_unlock_irqrestore(&sl->lock, flags);
        return ret;
}

because read_seqbegin() cannot return to the reader until the writer (if
there is one) calls write_sequnlock().

static inline void write_seqlock(seqlock_t *sl)
{
        spin_lock(&sl->lock);
        ++sl->sequence;
        smp_wmb();
}

static inline void write_sequnlock(seqlock_t *sl)
{
        smp_wmb();
        sl->sequence++;
        spin_unlock(&sl->lock);
}

Don't we call this situation (a reader thread temporarily behaving like a
writer thread who writes nothing) 'contended'?

Anyway, could you show me read_seqbegin2()/read_seqretry2() for testing
with the locktest module?

Regards.
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
  2011-03-30 12:17 ` Tetsuo Handa
@ 2011-03-31 13:59 ` Peter Zijlstra
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2011-03-31 13:59 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: rostedt, mingo, linux-kernel

On Wed, 2011-03-30 at 21:17 +0900, Tetsuo Handa wrote:
> Peter Zijlstra wrote:
> > > Also, I assume you meant to call
> > > spin_acquire() before entering the spin state (as with
> > >
> > > static inline void __raw_spin_lock(raw_spinlock_t *lock)
> > > {
> > > 	preempt_disable();
> > > 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
> > > 	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
> > > }
> > >
> > > . Otherwise, lockdep cannot report it when hit this bug upon the first call to
> > > this function).
> >
> > Huh no, of course not, a seqlock read side cannot contend in the classic
> > sense.
>
> I couldn't understand what 'contend' means. I think
>
> static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
> {
> 	unsigned ret;
> repeat:
> 	ret = sl->sequence;
> 	smp_rmb();
> 	if (unlikely(ret & 1)) {
> 		cpu_relax();
> 		goto repeat;
> 	}
> 	return ret;
> }
>
> is equivalent (except that above one will not write to any kernel memory) to
>
> static __always_inline unsigned read_seqbegin(seqlock_t *sl)
> {
> 	unsigned ret;
> 	unsigned long flags;
> 	spin_lock_irqsave(&sl->lock, flags);
> 	ret = sl->sequence;
> 	spin_unlock_irqrestore(&sl->lock, flags);
> 	return ret;
> }
>
> because read_seqbegin() cannot return to the reader until the writer (if there
> is one) calls write_sequnlock().

It more or less is, but conceptually the seqlock read-side is a
non-blocking algorithm and thus doesn't block or contend. The above
initial wait is merely an optimization to avoid having to retry, which
could be more expensive than simply waiting there.

Anyway, all the lockdep contention crap is purely about lockstat and
doesn't matter for dependency tracking.
> Don't we call this situation (a reader thread temporarily behaves like a writer
> thread who writes nothing) as 'contended'?
>
> Anyway, could you show me read_seqbegin2()/read_seqretry2() for testing with
> locktest module?

Like I wrote before:

> @@ -94,6 +94,7 @@ static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
>  		cpu_relax();
>  		goto repeat;
>  	}
> +	rwlock_acquire_read(&sl->lock.dep_map, 0, 0, _RET_IP_);
>
>  	return ret;
>  }
> @@ -107,6 +108,8 @@ static __always_inline int read_seqretry(const seqlock_t *sl, unsigned start)
>  {
>  	smp_rmb();
>
> +	rwlock_release(&sl->lock.dep_map, 1, _RET_IP_);
> +
>  	return unlikely(sl->sequence != start);
>  }

Should do, except that lockdep doesn't properly work for read-recursive
locks, which needs to get fixed.
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
  2011-03-26  4:12 [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin() Tetsuo Handa
  2011-03-28 17:12 ` Steven Rostedt
@ 2011-03-29 13:11 ` Peter Zijlstra
  2011-03-29 13:14 ` Peter Zijlstra
  2 siblings, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2011-03-29 13:11 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: mingo, linux-kernel, tglx

On Sat, 2011-03-26 at 13:12 +0900, Tetsuo Handa wrote:
> @@ -86,6 +86,11 @@ static inline int write_tryseqlock(seqlo
>  static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
>  {
>  	unsigned ret;
> +#ifdef CONFIG_PROVE_LOCKING
> +	unsigned long flags;
> +	spin_lock_irqsave(&((seqlock_t *) sl)->lock, flags);
> +	spin_unlock_irqrestore(&((seqlock_t *) sl)->lock, flags);
> +#endif
>
>  repeat:
>  	ret = sl->sequence;

That isn't the right way, something like the below would do, however
there's a reason this isn't done, we use these primitives from the VDSO
which means they're not permitted to write to any kernel memory at all.

We could sort that by creating a raw_ variant that doesn't get the
annotation, something I think is already done for -rt anyway.
---
 include/linux/seqlock.h |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index e98cd2e..f2bc19c 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -94,6 +94,7 @@ static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
 		cpu_relax();
 		goto repeat;
 	}
+	rwlock_acquire_read(&sl->lock.dep_map, 0, 0, _RET_IP_);
 
 	return ret;
 }
@@ -107,6 +108,8 @@ static __always_inline int read_seqretry(const seqlock_t *sl, unsigned start)
 {
 	smp_rmb();
 
+	rwlock_release(&sl->lock.dep_map, 1, _RET_IP_);
+
 	return unlikely(sl->sequence != start);
 }
 
@@ -120,6 +123,9 @@ static __always_inline int read_seqretry(const seqlock_t *sl, unsigned start)
 
 typedef struct seqcount {
 	unsigned sequence;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lockdep_map dep_map;
+#endif
 } seqcount_t;
 
 #define SEQCNT_ZERO { 0 }
@@ -148,6 +154,7 @@ static inline unsigned __read_seqcount_begin(const seqcount_t *s)
 		cpu_relax();
 		goto repeat;
 	}
+	rwlock_acquire_read(&s->dep_map, 0, 0, _RET_IP_);
 
 	return ret;
 }
@@ -183,7 +190,9 @@ static inline unsigned read_seqcount_begin(const seqcount_t *s)
  */
 static inline int __read_seqcount_retry(const seqcount_t *s, unsigned start)
 {
-	return unlikely(s->sequence != start);
+	int ret = unlikely(s->sequence != start);
+	rwlock_release(&s->dep_map, 1, _RET_IP_);
+	return ret;
 }
 
 /**
@@ -210,6 +219,7 @@ static inline int read_seqcount_retry(const seqcount_t *s, unsigned start)
  */
 static inline void write_seqcount_begin(seqcount_t *s)
 {
+	rwlock_acquire(&s->dep_map, 0, 0, _RET_IP_);
 	s->sequence++;
 	smp_wmb();
 }
@@ -218,6 +228,7 @@ static inline void write_seqcount_end(seqcount_t *s)
 {
 	smp_wmb();
 	s->sequence++;
+	rwlock_release(&s->dep_map, 1, _RET_IP_);
 }
 
 /**
* Re: [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin().
  2011-03-26  4:12 [RFC] seqlock,lockdep: Add lock primitives to read_seqbegin() Tetsuo Handa
  2011-03-28 17:12 ` Steven Rostedt
  2011-03-29 13:11 ` Peter Zijlstra
@ 2011-03-29 13:14 ` Peter Zijlstra
  2 siblings, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2011-03-29 13:14 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: mingo, linux-kernel

On Sat, 2011-03-26 at 13:12 +0900, Tetsuo Handa wrote:
>
> But there is still a problem. It seems to me that lockdep checks for this bug
> only when a new locking pattern (a locking pattern which was not already added
> to lockdep database) is added.

That's how it works. Why is that a problem?

> This freeze can be triggered by running
>
>   while :; do newns /sbin/pivot_root /proc/ /proc/sys/; done
>
> on one terminal and running
>
>   while :; do /bin/ls -l /proc/*/exe; done
>
> on another terminal. (The "newns" is a program that unshares the mnt namespace
> before execve() using CLONE_NEWNS.) But even after applying the patch above,
> lockdep does not show the trace above.

Then you're still missing something.

> lockdep shows the trace above only after I run test programs for TOMOYO (which
> causes a locking pattern that was generated by neither
> "/sbin/pivot_root /proc/ /proc/sys/" nor "/bin/ls -l /proc/*/exe" to be added
> to lockdep database).

And tomoyo manages to close the cycle for some reason.

> I think that we want some method for rechecking already added locking pattern.
> Maybe it is run by (e.g.) every 60 seconds. Maybe it is run when stall checking
> mechanisms report the possibility of stall. (The sysrq key didn't work after
> the freeze occurred.)

I don't get this, why would you need to recheck anything? lockdep does a
full analysis on every addition, rechecking when nothing changed should
not yield another result.