All of lore.kernel.org
 help / color / mirror / Atom feed
* race with i_flock?
@ 2002-07-17  2:27 Dave Hansen
  2002-07-17  6:32 ` Andrew Morton
  2002-07-17 11:55 ` Matthew Wilcox
  0 siblings, 2 replies; 5+ messages in thread
From: Dave Hansen @ 2002-07-17  2:27 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel

I've been getting some funny freezes while running Specweb.  I don't 
think this is the cause, but I did run into it.

Program received signal SIGSEGV, Segmentation fault.
0xc0147540 in may_open (nd=0xdaa69f78, acc_mode=6, flag=3)
     at /home/dave/oprofile/linux-2.5.25/include/linux/fs.h:1047

which is:
static inline int get_lease(struct inode *inode, unsigned int mode)
{
------->if (inode->i_flock && (inode->i_flock->fl_flags & FL_LEASE))
                 return __get_lease(inode, mode);
         return 0;
}

It appears that i_flock is NULL:

(gdb) print inode
$1 = (struct inode *) 0xe8d638f0
(gdb) print *inode
$2 = {...big snip
      i_flock = 0x0,
      ...}
(gdb) print &inode->i_flock
$3 = (struct file_lock **) 0xe8d6396c

But, there was a check for that just a second earlier.  Looks racy to 
me.  I noticed that there is use of i_sem in some of the calling 
functions, but not in get_lease()'s call sequence.  Is this a problem? 
    A quick grep for i_flock didn't show any obvious places where it 
was set back to NULL.

#0  0xc0147540 in may_open (nd=0xdaa69f78, acc_mode=6, flag=3)
     at /home/dave/oprofile/linux-2.5.25/include/linux/fs.h:1047
#1  0xc014786c in open_namei (pathname=0xe7292000 
"/mnt/sdc1/www/post.log", flag=3, mode=438,
     nd=0xdaa69f78) at namei.c:1294
#2  0xc013bcbb in filp_open (filename=0xe7292000 
"/mnt/sdc1/www/post.log", flags=2, mode=438)
     at open.c:616
#3  0xc013c0df in sys_open (filename=0x8130cd0 
"/mnt/sdc1/www/post.log", flags=2, mode=438)
     at open.c:806
#4  0xc0106fbb in syscall_call ()


-- 
Dave Hansen
haveblue@us.ibm.com


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: race with i_flock?
  2002-07-17  2:27 race with i_flock? Dave Hansen
@ 2002-07-17  6:32 ` Andrew Morton
  2002-07-17 11:55 ` Matthew Wilcox
  1 sibling, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2002-07-17  6:32 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Matthew Wilcox, linux-fsdevel

Dave Hansen wrote:
> 
> I've been getting some funny freezes while running Specweb.  I don't
> think this is the cause, but I did run into it.
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0xc0147540 in may_open (nd=0xdaa69f78, acc_mode=6, flag=3)
>      at /home/dave/oprofile/linux-2.5.25/include/linux/fs.h:1047
> 
> which is:
> static inline int get_lease(struct inode *inode, unsigned int mode)
> {
> ------->if (inode->i_flock && (inode->i_flock->fl_flags & FL_LEASE))
>                  return __get_lease(inode, mode);
>          return 0;
> }

I get that on about every 1000th reboot (ie: once a week ;))

I have a poke around in kgdb when it happens and everything
looks just fine.  So what must be happening is that some
other CPU has come in and fixed up the pointer before the
kgdb stub captured that CPU.

I'll grab a trace next time.

-

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: race with i_flock?
  2002-07-17  2:27 race with i_flock? Dave Hansen
  2002-07-17  6:32 ` Andrew Morton
@ 2002-07-17 11:55 ` Matthew Wilcox
  2002-07-17 22:31   ` Dave Hansen
  1 sibling, 1 reply; 5+ messages in thread
From: Matthew Wilcox @ 2002-07-17 11:55 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Matthew Wilcox, linux-fsdevel, Stephen Rothwell

On Tue, Jul 16, 2002 at 07:27:00PM -0700, Dave Hansen wrote:
> which is:
> static inline int get_lease(struct inode *inode, unsigned int mode)
> {
> ------->if (inode->i_flock && (inode->i_flock->fl_flags & FL_LEASE))
>                  return __get_lease(inode, mode);
>          return 0;
> }
> 
> It appears that i_flock is NULL:

Doh!  That's entirely possible.  open() could race with posix_lock_file and
remove the first element of the i_flock list between the two tests.  So...
let's change get_lease() to be:

static inline int get_lease(struct inode *inode, unsigned int mode)
{
	if (inode->i_flock)
		return __get_lease(inode, mode);
	return 0;
}

__get_lease in 2.5.x has sufficient checks in it already; 2.4 does not
and needs something like this:


 	lock_kernel();
 	flock = inode->i_flock;
+	if (!flock || (flock->fl_flags & FL_LEASE) == 0)
+		goto out;
 	if (flock->fl_type & F_INPROGRESS) {
                if ((mode & O_NONBLOCK)

This must be a 1-insn wide race.  I'm very impressed you managed to hit
it ;-)

-- 
Revolutions do not require corporate support.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: race with i_flock?
  2002-07-17 11:55 ` Matthew Wilcox
@ 2002-07-17 22:31   ` Dave Hansen
  2002-07-18  1:07     ` Matthew Wilcox
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Hansen @ 2002-07-17 22:31 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel, Stephen Rothwell

[-- Attachment #1: Type: text/plain, Size: 1317 bytes --]

Matthew Wilcox wrote:
> On Tue, Jul 16, 2002 at 07:27:00PM -0700, Dave Hansen wrote:
>>which is:
>>static inline int get_lease(struct inode *inode, unsigned int mode)
>>{
>>------->if (inode->i_flock && (inode->i_flock->fl_flags & FL_LEASE))
>>                 return __get_lease(inode, mode);
>>         return 0;
>>}
>>
>>It appears that i_flock is NULL:
> 
> Doh!  That's entirely possible.  open() could race with posix_lock_file and
> remove the first element of the i_flock list between the two tests.  So...
> let's change get_lease() to be:
> 
> static inline int get_lease(struct inode *inode, unsigned int mode)
> {
> 	if (inode->i_flock)
> 		return __get_lease(inode, mode);
> 	return 0;
> }
> 
> __get_lease in 2.5.x has sufficient checks in it already; 2.4 does not
> and needs something like this:
> 
> 
>  	lock_kernel();
>  	flock = inode->i_flock;
> +	if (!flock || (flock->fl_flags & FL_LEASE) == 0)
> +		goto out;
>  	if (flock->fl_type & F_INPROGRESS) {
>                 if ((mode & O_NONBLOCK)
> 
> This must be a 1-insn wide race.  I'm very impressed you managed to hit
> it ;-)

You have no idea :)

How about this patch?  I can't believe that I'm spreading the BKL, but 
it is needed in this case.  And, you _are_ removing it from flocking 
in 2.5, right?
-- 
Dave Hansen
haveblue@us.ibm.com

[-- Attachment #2: i_flock-race_fix.2.5.25-0.patch --]
[-- Type: text/plain, Size: 854 bytes --]

--- linux-2.5.25-clean/fs/locks.c	Thu Jul 11 00:18:43 2002
+++ linux/fs/locks.c	Wed Jul 17 15:28:47 2002
@@ -1083,8 +1083,6 @@
 	alloc_err = lease_alloc(NULL, mode & FMODE_WRITE ? F_WRLCK : F_RDLCK,
 			&new_fl);
 
-	lock_kernel();
-
 	time_out_leases(inode);
 
 	flock = inode->i_flock;
@@ -1155,7 +1153,6 @@
 	}
 
 out:
-	unlock_kernel();
 	if (!alloc_err)
 		locks_free_lock(new_fl);
 	return error;
--- linux-2.5.25-clean/include/linux/fs.h	Thu Jul 11 00:18:47 2002
+++ linux/include/linux/fs.h	Wed Jul 17 15:28:37 2002
@@ -1044,9 +1044,12 @@
 
 static inline int get_lease(struct inode *inode, unsigned int mode)
 {
+	int ret = 0;
+	lock_kernel();
 	if (inode->i_flock && (inode->i_flock->fl_flags & FL_LEASE))
-		return __get_lease(inode, mode);
-	return 0;
+		ret = __get_lease(inode, mode);
+	unlock_kernel();
+	return ret;
 }
 
 /* fs/open.c */

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: race with i_flock?
  2002-07-17 22:31   ` Dave Hansen
@ 2002-07-18  1:07     ` Matthew Wilcox
  0 siblings, 0 replies; 5+ messages in thread
From: Matthew Wilcox @ 2002-07-18  1:07 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Matthew Wilcox, linux-fsdevel, Stephen Rothwell

On Wed, Jul 17, 2002 at 03:31:14PM -0700, Dave Hansen wrote:
> Matthew Wilcox wrote:
> > static inline int get_lease(struct inode *inode, unsigned int mode)
> > {
> > 	if (inode->i_flock)
> > 		return __get_lease(inode, mode);
> > 	return 0;
> > }
> > 
> > __get_lease in 2.5.x has sufficient checks in it already; 2.4 does not
> > and needs something like this:
> > 
> > 
> >  	lock_kernel();
> >  	flock = inode->i_flock;
> > +	if (!flock || (flock->fl_flags & FL_LEASE) == 0)
> > +		goto out;
> >  	if (flock->fl_type & F_INPROGRESS) {
> >                 if ((mode & O_NONBLOCK)
> > 
> > This must be a 1-insn wide race.  I'm very impressed you managed to hit
> > it ;-)
> 
> You have no idea :)
> 
> How about this patch?  I can't believe that I'm spreading the BKL, but 
> it is needed in this case.  And, you _are_ removing it from flocking 
> in 2.5, right?

Hang on, I don't see that it's required.  We're entitled to check the
condition when not holding a lock, since assignments are atomic.  Then
we go back to check the condition still exists while holding the lock.

-- 
Revolutions do not require corporate support.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2002-07-18  1:07 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-07-17  2:27 race with i_flock? Dave Hansen
2002-07-17  6:32 ` Andrew Morton
2002-07-17 11:55 ` Matthew Wilcox
2002-07-17 22:31   ` Dave Hansen
2002-07-18  1:07     ` Matthew Wilcox

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.