public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [2.4] VFS locking problem during concurrent link/unlink
@ 2003-01-16 11:00 Oleg Drokin
  2003-01-16 15:39 ` Chris Mason
  0 siblings, 1 reply; 9+ messages in thread
From: Oleg Drokin @ 2003-01-16 11:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: eazgwmir, viro, nikita

Hello!

   Debugging reiserfs problem that can be demonstrated with script created by
   Zygo Blaxell, I started to wonder if the problem presented is indeed reiserfs
   fault and not VFS.
   Though the Zygo claims script only produces problems on reiserfs, I am trying
   it now myself on ext2 (which will take some time).

   Debugging shows that reiserfs_link is sometimes called for inodes whose
   i_nlink is zero (and all corresponding data is deleted already).
   So my current guess of what's going on is this:

   process 1                      process 2
   sys_unlink("a/b")		  sys_link("a/b", "c/d");
   down(inode of ("a"));	  down(inode of "c");
				  
   lock_kernel()
   reiserfs_unlink("a/b")
     decreases i_nlink of a/b to zero
     and removes name "b" from "a"
   unlock_kernel()
   d_delete("a/b")
(*)
				   lock_kernel()
				   reiserfs_link()
					at this point we do usual stuff,
					but inode's n_nlink is zero and
					file data already removed which
				        indicates reiserfs_delete_inode()
					was already called at (*)
				   unlock_kernel()

   So my question is "Is it really ok that sys_link/vfs_link does not
   take semaphore on parent dir of original path?", or should we
   actually put a workaround in reiserfs code to avoid such a situation?

Bye,
    Oleg
PS: Here's the script:
#!/bin/bash

# Create an empty filesystem:

mkreiserfs -f -f /dev/hdb2
mount /dev/hdb2 /data1 -t reiserfs

cd /data1

# Script used to control the load average.  Note that as written the loops
# below will keep spawning new processes, so we need some way to throttle
# them.  Change the '-lt 10' to another number to change the number
# of processes.

cat <<'LC' > loadcheck && chmod 755 loadcheck
#!/bin/sh
read av1 av5 av15 rest < /proc/loadavg
echo -n "Load Average: $av1 ... "
av1=${av1%.*}
if [ $av1 -lt 10 ]; then
        echo OK
        exit 0
else
        echo "Whoa, Nellie!"
        exit 1
fi
LC
# Create directories used by test
mkdir foo bar
mkdir foo/etc foo/usr foo/var

# Start up some rsyncs.  I use /etc, /usr, and /var because there's a
# good mixture of files with some hardlinks between them, and on a normal
# Linux system some of them change from time to time.

while sleep 1m; do
        ./loadcheck || continue;
        for x in usr etc var; do
                rsync -avxHS --delete /$x/. foo/$x/. &
        done;
done &
# Start up some cp -al's and rm -rf's.  Note there are two concurrent
# sets of 'cp's and two concurrent sets of 'rm's, and each of those
# has different instances of 'cp' and 'rm' running at different times.
for x in  1 2; do
        while sleep 1m; do
                ./loadcheck || continue;
                cp -al foo bar/`date +%s` &
        done &
        while sleep 1m; do
                ./loadcheck || continue;
                for x in bar/*; do
                        rm -rf $x;
                        sleep 1m;
                done &
        done &
done &




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.4] VFS locking problem during concurrent link/unlink
  2003-01-16 11:00 [2.4] VFS locking problem during concurrent link/unlink Oleg Drokin
@ 2003-01-16 15:39 ` Chris Mason
  2003-01-16 15:43   ` Oleg Drokin
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Mason @ 2003-01-16 15:39 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: linux-kernel, eazgwmir, viro, nikita

On Thu, 2003-01-16 at 06:00, Oleg Drokin wrote:
> Hello!
> 
>    Debugging reiserfs problem that can be demonstrated with script created by
>    Zygo Blaxell, I started to wonder if the problem presented is indeed reiserfs
>    fault and not VFS.
>    Though the Zygo claims script only produces problems on reiserfs, I am trying
>    it now myself on ext2 (which will take some time).
> 
>    Debugging shows that reiserfs_link is sometimes called for inodes whose
>    i_nlink is zero (and all corresponding data is deleted already).
>    So my current guess of what's going on is this:
> 

No, this is a reiserfs bug, since we schedule after doing link checks in
reiserfs_link and reiserfs_unlink.  I sent a patch to reiserfs dev a
while ago, I'll pull it out of the suse kernel and rediff against
2.4.20.

-chris



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.4] VFS locking problem during concurrent link/unlink
  2003-01-16 15:39 ` Chris Mason
@ 2003-01-16 15:43   ` Oleg Drokin
  2003-01-16 16:02     ` Chris Mason
  0 siblings, 1 reply; 9+ messages in thread
From: Oleg Drokin @ 2003-01-16 15:43 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-kernel, eazgwmir, viro, nikita

Hello!

On Thu, Jan 16, 2003 at 10:39:41AM -0500, Chris Mason wrote:
> >    Debugging reiserfs problem that can be demonstrated with script created by
> >    Zygo Blaxell, I started to wonder if the problem presented is indeed reiserfs
> >    fault and not VFS.
> >    Though the Zygo claims script only produces problems on reiserfs, I am trying
> >    it now myself on ext2 (which will take some time).
> > 
> >    Debugging shows that reiserfs_link is sometimes called for inodes whose
> >    i_nlink is zero (and all corresponding data is deleted already).
> >    So my current guess of what's going on is this:
> No, this is a reiserfs bug, since we schedule after doing link checks in
> reiserfs_link and reiserfs_unlink.  I sent a patch to reiserfs dev a
> while ago, I'll pull it out of the suse kernel and rediff against
> 2.4.20.

Yes we do.
But on the other hand I've put a check at the beginning of reiserfs_link
and I am still seeing these links on inodes with i_nlink == 0.

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.4] VFS locking problem during concurrent link/unlink
  2003-01-16 15:43   ` Oleg Drokin
@ 2003-01-16 16:02     ` Chris Mason
  2003-01-16 16:06       ` Nikita Danilov
  2003-01-16 16:07       ` Oleg Drokin
  0 siblings, 2 replies; 9+ messages in thread
From: Chris Mason @ 2003-01-16 16:02 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: linux-kernel, eazgwmir, viro, nikita

On Thu, 2003-01-16 at 10:43, Oleg Drokin wrote:
> Hello!
> 
> On Thu, Jan 16, 2003 at 10:39:41AM -0500, Chris Mason wrote:
> > >    Debugging reiserfs problem that can be demonstrated with script created by
> > >    Zygo Blaxell, I started to wonder if the problem presented is indeed reiserfs
> > >    fault and not VFS.
> > >    Though the Zygo claims script only produces problems on reiserfs, I am trying
> > >    it now myself on ext2 (which will take some time).
> > > 
> > >    Debugging shows that reiserfs_link is sometimes called for inodes whose
> > >    i_nlink is zero (and all corresponding data is deleted already).
> > >    So my current guess of what's going on is this:
> > No, this is a reiserfs bug, since we schedule after doing link checks in
> > reiserfs_link and reiserfs_unlink.  I sent a patch to reiserfs dev a
> > while ago, I'll pull it out of the suse kernel and rediff against
> > 2.4.20.
> 
> Yes we do.
> But on the other hand I've put a check at the beginning of reiserfs_link
> and I am still seeing these links on inodes with i_nlink == 0.

That's because we don't inc the link count in reiserfs_link before we
schedule.  The bug works a little like this:

link count at 1
reiserfs_link: make new directory entry for link, schedule()
reiserfs_unlink: dec link count to zero, remove file stat data
reiserfs_link: inc link count, return thinking the stat data is still
there

All of which leads to expanding chaos as we process this link pointing
to nowhere but still have a valid in ram inode pointing to it.

-chris



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.4] VFS locking problem during concurrent link/unlink
  2003-01-16 16:02     ` Chris Mason
@ 2003-01-16 16:06       ` Nikita Danilov
  2003-01-16 16:22         ` Chris Mason
  2003-01-16 16:07       ` Oleg Drokin
  1 sibling, 1 reply; 9+ messages in thread
From: Nikita Danilov @ 2003-01-16 16:06 UTC (permalink / raw)
  To: Chris Mason; +Cc: Oleg Drokin, linux-kernel, eazgwmir, viro

Chris Mason writes:
 > On Thu, 2003-01-16 at 10:43, Oleg Drokin wrote:

[...]

 > 
 > link count at 1
 > reiserfs_link: make new directory entry for link, schedule()
 > reiserfs_unlink: dec link count to zero, remove file stat data
 > reiserfs_link: inc link count, return thinking the stat data is still
 > there

What protects ext2 from doing the same on the SMP?

 > 
 > All of which leads to expanding chaos as we process this link pointing
 > to nowhere but still have a valid in ram inode pointing to it.
 > 
 > -chris

Nikita.

 > 
 > 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.4] VFS locking problem during concurrent link/unlink
  2003-01-16 16:02     ` Chris Mason
  2003-01-16 16:06       ` Nikita Danilov
@ 2003-01-16 16:07       ` Oleg Drokin
  1 sibling, 0 replies; 9+ messages in thread
From: Oleg Drokin @ 2003-01-16 16:07 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-kernel, eazgwmir, viro, nikita

Hello!

On Thu, Jan 16, 2003 at 11:02:08AM -0500, Chris Mason wrote:
> > Yes we do.
> > But on the other hand I've put a check at the beginning of reiserfs_link
> > and I am still seeing these links on inodes with i_nlink == 0.
> That's because we don't inc the link count in reiserfs_link before we
> schedule.  The bug works a little like this:
> link count at 1
> reiserfs_link: make new directory entry for link, schedule()
> reiserfs_unlink: dec link count to zero, remove file stat data
> reiserfs_link: inc link count, return thinking the stat data is still
> there

That was my original yesterday's assumption.
But debug prints I put in place did not confirmed it.
Also if we are having a dentry pinned in memory (by sys_link), inode should not
be deleted, so the statdata should stay inplace as Nikita argues.

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.4] VFS locking problem during concurrent link/unlink
  2003-01-16 16:06       ` Nikita Danilov
@ 2003-01-16 16:22         ` Chris Mason
  2003-01-16 17:03           ` Oleg Drokin
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Mason @ 2003-01-16 16:22 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: Oleg Drokin, linux-kernel, eazgwmir, viro

On Thu, 2003-01-16 at 11:06, Nikita Danilov wrote:
> Chris Mason writes:
>  > On Thu, 2003-01-16 at 10:43, Oleg Drokin wrote:
> 
> [...]
> 
>  > 
>  > link count at 1
>  > reiserfs_link: make new directory entry for link, schedule()
>  > reiserfs_unlink: dec link count to zero, remove file stat data
>  > reiserfs_link: inc link count, return thinking the stat data is still
>  > there
> 
> What protects ext2 from doing the same on the SMP?
> 

They inc the link count in ext2_link before scheduling.  The patch below
is what I had in mind, but is untested.

-chris

--- 1.24/fs/reiserfs/namei.c	Fri Aug  9 11:22:33 2002
+++ edited/fs/reiserfs/namei.c	Thu Jan 16 11:20:11 2003
@@ -748,6 +748,7 @@
     int windex ;
     struct reiserfs_transaction_handle th ;
     int jbegin_count;
+    unsigned long savelink;
 
     inode = dentry->d_inode;
 
@@ -783,11 +784,20 @@
 	inode->i_nlink = 1;
     }
 
+    inode->i_nlink--;
+
+    /* 
+     * we schedule before doing the add_save_link call, save the link
+     * count so we don't race
+     */
+    savelink = inode->i_nlink;
+
+
     retval = reiserfs_cut_from_item (&th, &path, &(de.de_entry_key), dir, NULL, 0);
-    if (retval < 0)
+    if (retval < 0) {
+	inode->i_nlink++;
 	goto end_unlink;
-
-    inode->i_nlink--;
+    }
     inode->i_ctime = CURRENT_TIME;
     reiserfs_update_sd (&th, inode);
 
@@ -796,7 +806,7 @@
     dir->i_ctime = dir->i_mtime = CURRENT_TIME;
     reiserfs_update_sd (&th, dir);
 
-    if (!inode->i_nlink)
+    if (!savelink)
        /* prevent file from getting lost */
        add_save_link (&th, inode, 0/* not truncate */);
 
@@ -900,6 +910,12 @@
 	//FIXME: sd_nlink is 32 bit for new files
 	return -EMLINK;
     }
+    if (inode->i_nlink == 0) {
+        return -ENOENT;
+    }
+
+    /* inc before scheduling so reiserfs_unlink knows we are here */
+    inode->i_nlink++;
 
     journal_begin(&th, dir->i_sb, jbegin_count) ;
     windex = push_journal_writer("reiserfs_link") ;
@@ -912,12 +928,12 @@
     reiserfs_update_inode_transaction(dir) ;
 
     if (retval) {
+        inode->i_nlink--;
 	pop_journal_writer(windex) ;
 	journal_end(&th, dir->i_sb, jbegin_count) ;
 	return retval;
     }
 
-    inode->i_nlink++;
     ctime = CURRENT_TIME;
     inode->i_ctime = ctime;
     reiserfs_update_sd (&th, inode);
@@ -992,6 +1008,7 @@
     int jbegin_count ; 
     umode_t old_inode_mode;
     time_t ctime;
+    unsigned long savelink = 1;
 
 
     /* two balancings: old name removal, new name insertion or "save" link,
@@ -1161,6 +1178,7 @@
 	} else {
 	    new_dentry_inode->i_nlink--;
 	}
+	savelink = new_dentry_inode->i_nlink;
 	ctime = CURRENT_TIME;
 	new_dentry_inode->i_ctime = ctime;
     }
@@ -1196,7 +1214,7 @@
     reiserfs_update_sd (&th, new_dir);
 
     if (new_dentry_inode) {
-	if (new_dentry_inode->i_nlink == 0)
+	if (savelink == 0)
 	    add_save_link (&th, new_dentry_inode, 0/* not truncate */);
 	reiserfs_update_sd (&th, new_dentry_inode);
     }




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.4] VFS locking problem during concurrent link/unlink
  2003-01-16 16:22         ` Chris Mason
@ 2003-01-16 17:03           ` Oleg Drokin
  2003-01-16 17:20             ` Oleg Drokin
  0 siblings, 1 reply; 9+ messages in thread
From: Oleg Drokin @ 2003-01-16 17:03 UTC (permalink / raw)
  To: Chris Mason; +Cc: Nikita Danilov, linux-kernel, eazgwmir, viro

Hello!

On Thu, Jan 16, 2003 at 11:22:32AM -0500, Chris Mason wrote:

> They inc the link count in ext2_link before scheduling.  The patch below
> is what I had in mind, but is untested.

It does not change anything.

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.4] VFS locking problem during concurrent link/unlink
  2003-01-16 17:03           ` Oleg Drokin
@ 2003-01-16 17:20             ` Oleg Drokin
  0 siblings, 0 replies; 9+ messages in thread
From: Oleg Drokin @ 2003-01-16 17:20 UTC (permalink / raw)
  To: Chris Mason; +Cc: Nikita Danilov, linux-kernel, eazgwmir, viro

Hello!

On Thu, Jan 16, 2003 at 08:03:41PM +0300, Oleg Drokin wrote:
> > They inc the link count in ext2_link before scheduling.  The patch below
> > is what I had in mind, but is untested.
> It does not change anything.

Err, I mean I still can reproduce the problem with this patch.

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2003-01-16 17:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-01-16 11:00 [2.4] VFS locking problem during concurrent link/unlink Oleg Drokin
2003-01-16 15:39 ` Chris Mason
2003-01-16 15:43   ` Oleg Drokin
2003-01-16 16:02     ` Chris Mason
2003-01-16 16:06       ` Nikita Danilov
2003-01-16 16:22         ` Chris Mason
2003-01-16 17:03           ` Oleg Drokin
2003-01-16 17:20             ` Oleg Drokin
2003-01-16 16:07       ` Oleg Drokin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox