* Fw: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
@ 2004-07-02 7:49 Andrew Morton
2004-07-02 11:04 ` Herbert Xu
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2004-07-02 7:49 UTC (permalink / raw)
To: netdev
Could someone please take a look at the locking in
net/ipv4/ipmr.c:ipmr_mfc_seq_next? It seems rather broken.
Begin forwarded message:
Date: Thu, 1 Jul 2004 18:01:00 -0700 (PDT)
From: Yichen Xie <yxie@cs.stanford.edu>
To: linux-kernel@vger.kernel.org
Subject: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
Hi all,
We are a group of researchers at Stanford working on program analysis
algorithms. We have been building a precision enhanced program analysis
engine at Stanford, and our first application was to derive mutex/lock
behavior in the linux kernel. In the process, we found 99 likely
synchronization errors in linux kernel version 2.6.5:
http://glide.stanford.edu/linux-lock/err1.html (69 errors)
http://glide.stanford.edu/linux-lock/err2.html (30 errors)
err1.html consists of potential double locks/unlocks. Acquiring a lock
twice in a row may result in a system hang, and releasing a lock more than
once with certain mutex functions (e.g. up) may cause critical section
violations.
err2.html consists of reports on inconsistent output lock states on
function exit. These errors usually correspond to missed lock operations
on error paths. (filenames in this report correspond to where a function
is declared, so CTAGS may come in handy to find the actual implementation
of the function).
In the error reports, functions are hyperlinked to their derived
summaries, and those of their callees (since the analysis spans function
calls, the error condition of a particular function usually depend on the
locking behavior of its callees).
For example, in function "radeon_pm_program_v2clk" (first error report in
err1.html), the tool flagged an error at line 323 of
"drivers/video/aty/radeon_pm.c". Line 323 invokes two macros, OUTPLL, and
INPLL. OUTPLL acquires "rinfo->reg_lock", and then evaluates "addr", which
is calculated, in this case, by calling _INPLL. By clicking on the link
"drivers/video/aty/radeon_pm.c:radeon_pm_program_v2clk", we can see that
_INPLL requires "rinfo->reg_lock" be unheld on entry (confirmed by looking
at its definition), which is not satisfied in this example. So this is a
double lock error and could potentially lead to a deadlock on MP systems.
We also have a separate web interface to the summary database at:
http://glide.stanford.edu/linux-lock/
For example, typing "fh_put" in the input box gives
=========
SUMMARY
=========
FUNCTION SUMMARY: 'include/linux/nfsd/nfsfh.h:fh_put'
{
dcache_lock(global): [unlocked -> unlocked]
fhp(param#0)->fh_dentry->d_lock: [unlocked -> unlocked]
fhp(param#0)->fh_dentry->d_inode->i_sem: [locked -> unlocked]
}
Each line in the function summary correspond to the requirements and
effects on one particular lock. For example, fh_put requires that the
global lock variable dcache_lock be unheld on entry, and it'll remain
unheld on exit. It also requires fhp->fh_dentry->d_inode->i_sem be held on
entry and it'll release it on exit. (note: please ignore summaries for
lock premitives like spin_lock or down_interruptible; models for these
functions are built into the checker and the derived summaries are not
used).
We have found that some modules in the kernel has functions with
complicated synchronization behavior (esp. in filesystems), and we hope
summaries generated by this tool could be useful not only for bug finding,
but also for documentation purposes as well.
The analysis is intraprocedurally "path sensitive", so it won't be fooled
by cases like
if (flag & BLOCKING) spin_lock(&l);
...
if (flag & BLOCKING) spin_unlock(&l);
or
if (!spin_trylock(&l))
return -EAGAIN;
...
spin_unlock(&l);
The analysis algorithm models values (down to individual bits) and
pointers in the program with a boolean satisfiability solver with high
precision, and we're actively looking for other properties involving
(heavy) data dependencies where naive analysis would fail. Suggestions and
insights from the linux kernel community will be more than welcome!
As always, feedbacks and confirmations will be greatly appreciated!
Best regards,
Yichen Xie
<yxie@cs.stanford.edu>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
[not found] <Pine.LNX.4.44.0407011747040.4015-100000@kaki.stanford.edu>
@ 2004-07-02 8:47 ` YOSHIFUJI Hideaki / 吉藤英明
0 siblings, 0 replies; 8+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2004-07-02 8:47 UTC (permalink / raw)
To: davem, yxie; +Cc: linux-kernel, netdev
Hello.
In article <Pine.LNX.4.44.0407011747040.4015-100000@kaki.stanford.edu> (at Thu, 1 Jul 2004 18:01:00 -0700 (PDT)), Yichen Xie <yxie@cs.stanford.edu> says:
> http://glide.stanford.edu/linux-lock/err1.html (69 errors)
:
> err1.html consists of potential double locks/unlocks. Acquiring a lock
> twice in a row may result in a system hang, and releasing a lock more than
> once with certain mutex functions (e.g. up) may cause critical section
> violations.
Well,
lapb_iface.c:lapb_unregister() has typo and we failed to get lapb_list_lock.
rose_route.c:rose_remove_node()'s caller has rose_node_list_lock; so, this is
dead lock.
Here's the fix.
===== net/lapb/lapb_iface.c 1.14 vs edited =====
--- 1.14/net/lapb/lapb_iface.c 2004-01-11 08:39:04 +09:00
+++ edited/net/lapb/lapb_iface.c 2004-07-02 17:23:27 +09:00
@@ -176,7 +176,7 @@
struct lapb_cb *lapb;
int rc = LAPB_BADTOKEN;
- write_unlock_bh(&lapb_list_lock);
+ write_lock_bh(&lapb_list_lock);
lapb = __lapb_devtostruct(dev);
if (!lapb)
goto out;
===== net/rose/rose_route.c 1.12 vs edited =====
--- 1.12/net/rose/rose_route.c 2004-06-04 09:11:24 +09:00
+++ edited/net/rose/rose_route.c 2004-07-02 17:26:08 +09:00
@@ -206,7 +206,6 @@
{
struct rose_node *s;
- spin_lock_bh(&rose_node_list_lock);
if ((s = rose_node_list) == rose_node) {
rose_node_list = rose_node->next;
kfree(rose_node);
Thanks.
--
Hideaki YOSHIFUJI @ USAGI Project <yoshfuji@linux-ipv6.org>
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Fw: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
2004-07-02 7:49 Fw: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database Andrew Morton
@ 2004-07-02 11:04 ` Herbert Xu
2004-07-02 19:30 ` Andrew Morton
0 siblings, 1 reply; 8+ messages in thread
From: Herbert Xu @ 2004-07-02 11:04 UTC (permalink / raw)
To: Andrew Morton; +Cc: netdev
Andrew Morton <akpm@osdl.org> wrote:
>
> Could someone please take a look at the locking in
> net/ipv4/ipmr.c:ipmr_mfc_seq_next? It seems rather broken.
Obfuscated yes, broken no. Unfortunately the seq interface tends to
produce code like this.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Fw: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
2004-07-02 11:04 ` Herbert Xu
@ 2004-07-02 19:30 ` Andrew Morton
2004-07-02 19:50 ` Stephen Hemminger
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2004-07-02 19:30 UTC (permalink / raw)
To: Herbert Xu; +Cc: netdev
Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> Andrew Morton <akpm@osdl.org> wrote:
> >
> > Could someone please take a look at the locking in
> > net/ipv4/ipmr.c:ipmr_mfc_seq_next? It seems rather broken.
>
> Obfuscated yes, broken no.
Really? Even if that goto
if (it->cache == &mfc_unres_queue)
goto end_of_list;
is taken?
Bizarre. It sure looks wrong. But then, if the author couldn't be
bothered describing the locking, I can't be bothered reverse engineering
it, so there.
> Unfortunately the seq interface tends to produce code like this.
Maybe. It could be that the locking in there is straightforward and
sensible. But because it is also secretly designed, we see what happens -
it wasted the Stanford guys' time, my time, your time, etc.
Sigh.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
2004-07-02 19:30 ` Andrew Morton
@ 2004-07-02 19:50 ` Stephen Hemminger
2004-07-03 5:42 ` OGAWA Hirofumi
0 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2004-07-02 19:50 UTC (permalink / raw)
To: Andrew Morton; +Cc: Herbert Xu, netdev
On Fri, 2 Jul 2004 12:30:22 -0700
Andrew Morton <akpm@osdl.org> wrote:
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
> >
> > Andrew Morton <akpm@osdl.org> wrote:
> > >
> > > Could someone please take a look at the locking in
> > > net/ipv4/ipmr.c:ipmr_mfc_seq_next? It seems rather broken.
> >
> > Obfuscated yes, broken no.
>
> Really? Even if that goto
>
> if (it->cache == &mfc_unres_queue)
> goto end_of_list;
>
> is taken?
The problem is that the seq_file interface is trying to create an iterator
over a set of data. The interface expects that the usage will always fit
a given pattern.
The implied semantic is:
if returned handle is NULL, then nothing is locked and it->cache = NULL
if returned handle is in the mfc_cache_table then
it->cache points to mfc_cache_array and mrt_lock is held
if returned handle is in the mfc_unres_queue then
it->cache points to mfc_unres_queue and mfc_unres_lock is held
the seq_next is only valid to mean give me the next entry after the passed handle
therefore if there are no more entries after the handle and the handle was on
the unresolved queue, then the end is reached.
When seq_stop is called, the it->cache pointer will be NULL so no unlock is done.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
2004-07-02 19:50 ` Stephen Hemminger
@ 2004-07-03 5:42 ` OGAWA Hirofumi
2004-07-03 6:13 ` Herbert Xu
0 siblings, 1 reply; 8+ messages in thread
From: OGAWA Hirofumi @ 2004-07-03 5:42 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Andrew Morton, Herbert Xu, netdev
Stephen Hemminger <shemminger@osdl.org> writes:
> > > Andrew Morton <akpm@osdl.org> wrote:
> > > >
> > > > Could someone please take a look at the locking in
> > > > net/ipv4/ipmr.c:ipmr_mfc_seq_next? It seems rather broken.
> > >
> > > Obfuscated yes, broken no.
> >
> > Really? Even if that goto
> >
> > if (it->cache == &mfc_unres_queue)
> > goto end_of_list;
> >
> > is taken?
>
> The problem is that the seq_file interface is trying to create an iterator
> over a set of data. The interface expects that the usage will always fit
> a given pattern.
>
> The implied semantic is:
> if returned handle is NULL, then nothing is locked and it->cache = NULL
>
> if returned handle is in the mfc_cache_table then
> it->cache points to mfc_cache_array and mrt_lock is held
> if returned handle is in the mfc_unres_queue then
> it->cache points to mfc_unres_queue and mfc_unres_lock is held
>
> the seq_next is only valid to mean give me the next entry after the passed handle
> therefore if there are no more entries after the handle and the handle was on
> the unresolved queue, then the end is reached.
>
> When seq_stop is called, the it->cache pointer will be NULL so no unlock is done.
How about the following patch? At lease, this seems to need ipmr_mfc_release().
Untested patch, sorry.
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
net/ipv4/ipmr.c | 50 ++++++++++++++++++++++++++------------------------
1 files changed, 26 insertions(+), 24 deletions(-)
diff -puN net/ipv4/ipmr.c~ipmr-cleanup net/ipv4/ipmr.c
--- linux-2.6.7/net/ipv4/ipmr.c~ipmr-cleanup 2004-07-03 14:13:50.000000000 +0900
+++ linux-2.6.7-hirofumi/net/ipv4/ipmr.c 2004-07-03 14:30:54.000000000 +0900
@@ -1710,7 +1710,6 @@ struct ipmr_mfc_iter {
int ct;
};
-
static struct mfc_cache *ipmr_mfc_seq_idx(struct ipmr_mfc_iter *it, loff_t pos)
{
struct mfc_cache *mfc;
@@ -1734,7 +1733,6 @@ static struct mfc_cache *ipmr_mfc_seq_id
return NULL;
}
-
static void *ipmr_mfc_seq_start(struct seq_file *seq, loff_t *pos)
{
return *pos ? ipmr_mfc_seq_idx(seq->private, *pos - 1)
@@ -1754,31 +1752,29 @@ static void *ipmr_mfc_seq_next(struct se
if (mfc->next)
return mfc->next;
- if (it->cache == &mfc_unres_queue)
- goto end_of_list;
+ if (it->cache == mfc_cache_array) {
+ while (++it->ct < MFC_LINES) {
+ mfc = mfc_cache_array[it->ct];
+ if (mfc)
+ return mfc;
+ }
- BUG_ON(it->cache != mfc_cache_array);
+ /*
+ * exhausted cache_array, show unresolved.
+ * So switch to mfc_unres_queue.
+ */
+ read_unlock(&mrt_lock);
- while (++it->ct < MFC_LINES) {
- mfc = mfc_cache_array[it->ct];
- if (mfc)
- return mfc;
+ it->cache = &mfc_unres_queue;
+ it->ct = 0;
+ spin_lock_bh(&mfc_unres_lock);
+ if (it->cache == mfc_unres_queue) {
+ mfc = mfc_unres_queue;
+ if (mfc)
+ return mfc;
+ }
}
- /* exhausted cache_array, show unresolved */
- read_unlock(&mrt_lock);
- it->cache = &mfc_unres_queue;
- it->ct = 0;
-
- spin_lock_bh(&mfc_unres_lock);
- mfc = mfc_unres_queue;
- if (mfc)
- return mfc;
-
- end_of_list:
- spin_unlock_bh(&mfc_unres_lock);
- it->cache = NULL;
-
return NULL;
}
@@ -1854,7 +1850,13 @@ out:
out_kfree:
kfree(s);
goto out;
+}
+static int ipmr_mfc_release(struct inode *inode, struct file *file)
+{
+ struct seq_file *seq = file->private_data;
+ kfree(seq->private);
+ return seq_release(inode, file);
}
static struct file_operations ipmr_mfc_fops = {
@@ -1862,7 +1864,7 @@ static struct file_operations ipmr_mfc_f
.open = ipmr_mfc_open,
.read = seq_read,
.llseek = seq_lseek,
- .release = seq_release,
+ .release = ipmr_mfc_release,
};
#endif
_
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
2004-07-03 5:42 ` OGAWA Hirofumi
@ 2004-07-03 6:13 ` Herbert Xu
2004-07-03 8:03 ` OGAWA Hirofumi
0 siblings, 1 reply; 8+ messages in thread
From: Herbert Xu @ 2004-07-03 6:13 UTC (permalink / raw)
To: OGAWA Hirofumi; +Cc: Stephen Hemminger, Andrew Morton, netdev
On Sat, Jul 03, 2004 at 02:42:47PM +0900, OGAWA Hirofumi wrote:
>
> @@ -1854,7 +1850,13 @@ out:
> out_kfree:
> kfree(s);
> goto out;
> +}
>
> +static int ipmr_mfc_release(struct inode *inode, struct file *file)
> +{
> + struct seq_file *seq = file->private_data;
> + kfree(seq->private);
> + return seq_release(inode, file);
> }
>
> static struct file_operations ipmr_mfc_fops = {
> @@ -1862,7 +1864,7 @@ static struct file_operations ipmr_mfc_f
> .open = ipmr_mfc_open,
> .read = seq_read,
> .llseek = seq_lseek,
> - .release = seq_release,
> + .release = ipmr_mfc_release,
> };
> #endif
This is a good catch. But you probably need to fix the vif stuff as
well.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
2004-07-03 6:13 ` Herbert Xu
@ 2004-07-03 8:03 ` OGAWA Hirofumi
0 siblings, 0 replies; 8+ messages in thread
From: OGAWA Hirofumi @ 2004-07-03 8:03 UTC (permalink / raw)
To: Herbert Xu; +Cc: Stephen Hemminger, Andrew Morton, netdev
Herbert Xu <herbert@gondor.apana.org.au> writes:
> > +static int ipmr_mfc_release(struct inode *inode, struct file *file)
> > +{
> > + struct seq_file *seq = file->private_data;
> > + kfree(seq->private);
> > + return seq_release(inode, file);
> > }
> >
> > static struct file_operations ipmr_mfc_fops = {
> > @@ -1862,7 +1864,7 @@ static struct file_operations ipmr_mfc_f
> > .open = ipmr_mfc_open,
> > .read = seq_read,
> > .llseek = seq_lseek,
> > - .release = seq_release,
> > + .release = ipmr_mfc_release,
> > };
> > #endif
>
> This is a good catch. But you probably need to fix the vif stuff as
> well.
Yes, right. I didn't notice it. Thanks.
- use seq_release_private for ipmr_vif/mfc_fops.
The anothor bug, it->cache need in *->seq_start* by NULL. Otherwise,
it->cache can be pointed the previous state by lseek().
e.g.
->seq_next()
it->cache = &mfc_unres_queue
->seq_stop()
lseek()
->seq_start()
return SEQ_START_TOKEN (it->cache still has &mfc_unres_queue)
->seq_show()
->seq_stop()
bang
Thanks.
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
---
net/ipv4/ipmr.c | 45 ++++++++++++++++++++-------------------------
1 files changed, 20 insertions(+), 25 deletions(-)
diff -puN net/ipv4/ipmr.c~ipmr-cleanup net/ipv4/ipmr.c
--- linux-2.6.7/net/ipv4/ipmr.c~ipmr-cleanup 2004-07-03 14:13:50.000000000 +0900
+++ linux-2.6.7-hirofumi/net/ipv4/ipmr.c 2004-07-03 16:53:05.000000000 +0900
@@ -1702,7 +1702,7 @@ static struct file_operations ipmr_vif_f
.open = ipmr_vif_open,
.read = seq_read,
.llseek = seq_lseek,
- .release = seq_release,
+ .release = seq_release_private,
};
struct ipmr_mfc_iter {
@@ -1710,7 +1710,6 @@ struct ipmr_mfc_iter {
int ct;
};
-
static struct mfc_cache *ipmr_mfc_seq_idx(struct ipmr_mfc_iter *it, loff_t pos)
{
struct mfc_cache *mfc;
@@ -1734,9 +1733,11 @@ static struct mfc_cache *ipmr_mfc_seq_id
return NULL;
}
-
static void *ipmr_mfc_seq_start(struct seq_file *seq, loff_t *pos)
{
+ struct ipmr_mfc_iter *it = seq->private;
+ it->cache = NULL;
+ it->ct = 0;
return *pos ? ipmr_mfc_seq_idx(seq->private, *pos - 1)
: SEQ_START_TOKEN;
}
@@ -1754,31 +1755,27 @@ static void *ipmr_mfc_seq_next(struct se
if (mfc->next)
return mfc->next;
- if (it->cache == &mfc_unres_queue)
- goto end_of_list;
+ if (it->cache == mfc_cache_array) {
+ while (++it->ct < MFC_LINES) {
+ mfc = mfc_cache_array[it->ct];
+ if (mfc)
+ return mfc;
+ }
- BUG_ON(it->cache != mfc_cache_array);
+ /*
+ * exhausted cache_array, show unresolved.
+ * So switch to mfc_unres_queue.
+ */
+ read_unlock(&mrt_lock);
- while (++it->ct < MFC_LINES) {
- mfc = mfc_cache_array[it->ct];
+ it->cache = &mfc_unres_queue;
+ it->ct = 0;
+ spin_lock_bh(&mfc_unres_lock);
+ mfc = mfc_unres_queue;
if (mfc)
return mfc;
}
- /* exhausted cache_array, show unresolved */
- read_unlock(&mrt_lock);
- it->cache = &mfc_unres_queue;
- it->ct = 0;
-
- spin_lock_bh(&mfc_unres_lock);
- mfc = mfc_unres_queue;
- if (mfc)
- return mfc;
-
- end_of_list:
- spin_unlock_bh(&mfc_unres_lock);
- it->cache = NULL;
-
return NULL;
}
@@ -1846,7 +1843,6 @@ static int ipmr_mfc_open(struct inode *i
if (rc)
goto out_kfree;
- memset(s, 0, sizeof(*s));
seq = file->private_data;
seq->private = s;
out:
@@ -1854,7 +1850,6 @@ out:
out_kfree:
kfree(s);
goto out;
-
}
static struct file_operations ipmr_mfc_fops = {
@@ -1862,7 +1857,7 @@ static struct file_operations ipmr_mfc_f
.open = ipmr_mfc_open,
.read = seq_read,
.llseek = seq_lseek,
- .release = seq_release,
+ .release = seq_release_private,
};
#endif
_
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2004-07-03 8:03 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-07-02 7:49 Fw: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database Andrew Morton
2004-07-02 11:04 ` Herbert Xu
2004-07-02 19:30 ` Andrew Morton
2004-07-02 19:50 ` Stephen Hemminger
2004-07-03 5:42 ` OGAWA Hirofumi
2004-07-03 6:13 ` Herbert Xu
2004-07-03 8:03 ` OGAWA Hirofumi
[not found] <Pine.LNX.4.44.0407011747040.4015-100000@kaki.stanford.edu>
2004-07-02 8:47 ` YOSHIFUJI Hideaki / 吉藤英明
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).