netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Fw: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
@ 2004-07-02  7:49 Andrew Morton
  2004-07-02 11:04 ` Herbert Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2004-07-02  7:49 UTC (permalink / raw)
  To: netdev


Could someone please take a look at the locking in
net/ipv4/ipmr.c:ipmr_mfc_seq_next?  It seems rather broken.


Begin forwarded message:

Date: Thu, 1 Jul 2004 18:01:00 -0700 (PDT)
From: Yichen Xie <yxie@cs.stanford.edu>
To: linux-kernel@vger.kernel.org
Subject: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database


Hi all, 

We are a group of researchers at Stanford working on program analysis
algorithms.  We have been building a precision enhanced program analysis
engine at Stanford, and our first application was to derive mutex/lock
behavior in the linux kernel. In the process, we found 99 likely
synchronization errors in linux kernel version 2.6.5:

    http://glide.stanford.edu/linux-lock/err1.html (69 errors)
    http://glide.stanford.edu/linux-lock/err2.html (30 errors)

err1.html consists of potential double locks/unlocks. Acquiring a lock
twice in a row may result in a system hang, and releasing a lock more than
once with certain mutex functions (e.g. up) may cause critical section
violations.

err2.html consists of reports on inconsistent output lock states on
function exit. These errors usually correspond to missed lock operations
on error paths. (filenames in this report correspond to where a function
is declared, so CTAGS may come in handy to find the actual implementation
of the function).

In the error reports, functions are hyperlinked to their derived
summaries, and those of their callees (since the analysis spans function
calls, the error condition of a particular function usually depend on the
locking behavior of its callees).

For example, in function "radeon_pm_program_v2clk" (first error report in
err1.html), the tool flagged an error at line 323 of
"drivers/video/aty/radeon_pm.c". Line 323 invokes two macros, OUTPLL, and
INPLL. OUTPLL acquires "rinfo->reg_lock", and then evaluates "addr", which
is calculated, in this case, by calling _INPLL. By clicking on the link
"drivers/video/aty/radeon_pm.c:radeon_pm_program_v2clk", we can see that
_INPLL requires "rinfo->reg_lock" be unheld on entry (confirmed by looking
at its definition), which is not satisfied in this example. So this is a
double lock error and could potentially lead to a deadlock on MP systems.

We also have a separate web interface to the summary database at:

	http://glide.stanford.edu/linux-lock/

For example, typing "fh_put" in the input box gives

=========
 SUMMARY 
=========
FUNCTION SUMMARY: 'include/linux/nfsd/nfsfh.h:fh_put'
{
  dcache_lock(global): [unlocked -> unlocked]
  fhp(param#0)->fh_dentry->d_lock: [unlocked -> unlocked]
  fhp(param#0)->fh_dentry->d_inode->i_sem: [locked -> unlocked]
}

Each line in the function summary correspond to the requirements and
effects on one particular lock. For example, fh_put requires that the
global lock variable dcache_lock be unheld on entry, and it'll remain
unheld on exit. It also requires fhp->fh_dentry->d_inode->i_sem be held on
entry and it'll release it on exit. (note: please ignore summaries for
lock premitives like spin_lock or down_interruptible; models for these
functions are built into the checker and the derived summaries are not
used).

We have found that some modules in the kernel has functions with
complicated synchronization behavior (esp. in filesystems), and we hope
summaries generated by this tool could be useful not only for bug finding,
but also for documentation purposes as well.

The analysis is intraprocedurally "path sensitive", so it won't be fooled
by cases like

     if (flag & BLOCKING) spin_lock(&l);
     ...
     if (flag & BLOCKING) spin_unlock(&l);

or

     if (!spin_trylock(&l))
	return -EAGAIN;
     ...
     spin_unlock(&l);

The analysis algorithm models values (down to individual bits) and
pointers in the program with a boolean satisfiability solver with high
precision, and we're actively looking for other properties involving
(heavy) data dependencies where naive analysis would fail. Suggestions and
insights from the linux kernel community will be more than welcome!

As always, feedbacks and confirmations will be greatly appreciated!

Best regards,
Yichen Xie
<yxie@cs.stanford.edu>


^ permalink raw reply	[flat|nested] 8+ messages in thread
[parent not found: <Pine.LNX.4.44.0407011747040.4015-100000@kaki.stanford.edu>]

end of thread, other threads:[~2004-07-03  8:03 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-07-02  7:49 Fw: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database Andrew Morton
2004-07-02 11:04 ` Herbert Xu
2004-07-02 19:30   ` Andrew Morton
2004-07-02 19:50     ` Stephen Hemminger
2004-07-03  5:42       ` OGAWA Hirofumi
2004-07-03  6:13         ` Herbert Xu
2004-07-03  8:03           ` OGAWA Hirofumi
     [not found] <Pine.LNX.4.44.0407011747040.4015-100000@kaki.stanford.edu>
2004-07-02  8:47 ` YOSHIFUJI Hideaki / 吉藤英明

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).