* Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
@ 2003-01-15  2:35 Bernhard Sadlowski
  2003-01-15 10:43 ` Oleg Drokin
  0 siblings, 1 reply; 7+ messages in thread
From: Bernhard Sadlowski @ 2003-01-15  2:35 UTC (permalink / raw)
  To: reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 5991 bytes --]

I am using the attached stress.sh script (probably from this mailing list)
to create load on a reiserfs filesystem. It forks 100
(read, write, delete) processes:

# mkreiserfs /dev/sda4
# mount /dev/sda4 /backup
# stress.sh -c /usr -n 100 /backup 

Then wait until /backup fills up.

# mount |grep sda
/dev/sda3 on / type ext3 (rw,errors=remount-ro)
/dev/sda2 on /boot type ext3 (rw,errors=remount-ro)
/dev/sda4 on /backup type reiserfs (rw)

# df -k / /boot /backup
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/sda3              4134932   3355468    569416  86% /
/dev/sda2               132221     72896     52498  59% /boot
/dev/sda4             12530312  12530312         0 100% /backup

# gcc -v
Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.4/specs
gcc version 2.95.4 20011002 (Debian prerelease)

# uname -a
Linux dematl06 2.4.20-p4-debug #1 SMP Wed Jan 15 00:48:49 CET 2003 i686 unknown

The machine is a Debian woody box with 2 x 2.40 GHz Intel Xeon (4 virtual
CPUs).

# uptime
 03:04:18 up  1:02,  1 user,  load average: 100.99, 100.83, 89.56

This happens on reiserfs:

# iostat 1
avg-cpu:  %user   %nice    %sys   %idle
           0.00    0.00    0.00  100.00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev8-0            0.00         0.00         0.00          0          0
dev8-1            0.00         0.00         0.00          0          0

# vmstat 1
   procs                      memory    swap          io     system       cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs   us  sy  id
 0 100  1   4932   8404 117916 1559292   0   1   206   950   81   211   0   6  94
 0 100  1   4932   8404 117916 1559292   0   0     0     0  107     6   0   0 100
 0 100  1   4932   8404 117916 1559292   0   0     0     0  106    10   0   0 100
 0 100  1   4932   8404 117916 1559292   0   0     0     0  105     6   0   0 100
...

All I/O freezes, and even after killing the script the remaining cp and
mv commands don't terminate. They are in state "D". A simple "ls
/backup" never returns. Only a hard power-down recovers the machine,
because "init 6" etc. doesn't work. I have even enabled reiserfs
debugging, but I don't see any additional information.

# ls -l /backup &
[2] 1432

# ps ax|grep ls
 1432 pts/0    D      0:00 ls -l /backup
 1445 pts/0    S      0:00 grep ls
 
# kill -9 1432

# ps ax|grep ls
 1432 pts/0    D      0:00 ls -l /backup
 1450 pts/0    S      0:00 grep ls
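
State "D" in the ps output above is uninterruptible sleep: the process is
blocked inside the kernel, and signals (including SIGKILL) are not acted on
until it wakes, which is why kill -9 has no visible effect. A minimal sketch
of reading that state programmatically, assuming the /proc/<pid>/stat layout
"pid (comm) state ..."; the function names stat_line_state and process_state
are illustrative and not taken from any existing tool:

```c
#include <stdio.h>
#include <string.h>

/* Extract the one-letter state from a /proc/<pid>/stat line.
 * The layout is "pid (comm) state ...". comm may itself contain
 * spaces or parentheses, so search for the LAST ')' in the line.
 * Returns '?' if the line does not match that layout. */
char stat_line_state(const char *line)
{
    const char *close = strrchr(line, ')');

    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '?';
    return close[2];
}

/* Read the state of a live process from /proc; '?' if unreadable. */
char process_state(long pid)
{
    char path[64];
    char line[1024];
    char state = '?';
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%ld/stat", pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return '?';
    if (fgets(line, sizeof(line), fp) != NULL)
        state = stat_line_state(line);
    fclose(fp);
    return state;
}
```

For the hung ls above, process_state(1432) would be expected to return 'D'
for as long as the process stays blocked.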

If I use the jfs filesystem, the same thing happens.

If I use ext3, everything is fine, and the cp/mv/rm processes proceed
even with full filesystem /backup and load = 100. iostat and vmstat
still show I/O on filesystem /backup.

The kernel is a standard 2.4.20 kernel with only one additional patch,
for the SK9Dxx Gigabit Ethernet adapter from Syskonnect. During the
tests this module (sk9dlin) is not loaded.

I tried several kernels: 2.4.19, 2.4.20, 2.4.20-ac2, 2.4.21-pre3. All
with the same results.

Is something wrong? What is wrong? 

# lsmod
Module                  Size  Used by    Not tainted
reiserfs              250656   1  (autoclean)
nfs                    67676   1  (autoclean)
lockd                  47680   1  (autoclean) [nfs]
sunrpc                 64692   1  (autoclean) [nfs lockd]
eepro100               18380   1  (autoclean)
mii                     2400   0  (autoclean) [eepro100]
md                     57344   0  (autoclean)
lvm-mod                59360   0 
rtc                     6492   0  (autoclean)
unix                   15716  62  (autoclean)

# dmesg | tail -9
reiserfs:warning: CONFIG_REISERFS_CHECK is set ON
reiserfs:warning: - it is slow mode for debugging.
reiserfs: checking transaction log (device 08:04) ...
journal-1153: found in header: first_unflushed_offset 6,
last_flushed_trans_id 11
journal-1206: Starting replay from offset 6, trans_id 12
journal-1299: Setting newest_mount_id to 11
Using r5 hash to sort names
ReiserFS version 3.6.25
vs-8301: reiserfs_kmalloc: allocated memory 200904

# lspci
00:00.0 Host bridge: Intel Corp. e7500 [Plumas] DRAM Controller (rev 03)
00:00.1 Class ff00: Intel Corp. e7500 [Plumas] DRAM Controller Error Reporting (rev 03)
00:02.0 PCI bridge: Intel Corp. e7500 [Plumas] HI_B Virtual PCI Bridge (F0) (rev 03)
00:1d.0 USB Controller: Intel Corp. 82801CA/CAM USB (Hub  (rev 02)
00:1d.1 USB Controller: Intel Corp. 82801CA/CAM USB (Hub  (rev 02)
00:1d.2 USB Controller: Intel Corp. 82801CA/CAM USB (Hub  (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB PCI Bridge (rev 42)
00:1f.0 ISA bridge: Intel Corp. 82801CA ISA Bridge (LPC) (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801CA IDE U100 (rev 02)
00:1f.3 SMBus: Intel Corp. 82801CA/CAM SMBus (rev 02)
01:1c.0 PIC: Intel Corp. 82870P2 P64H2 I/OxAPIC (rev 03)
01:1d.0 PCI bridge: Intel Corp. 82870P2 P64H2 Hub PCI Bridge (rev 03)
01:1e.0 PIC: Intel Corp. 82870P2 P64H2 I/OxAPIC (rev 03)
01:1f.0 PCI bridge: Intel Corp. 82870P2 P64H2 Hub PCI Bridge (rev 03)
03:01.0 Ethernet controller: Syskonnect (Schneider & Koch) Gigabit Ethernet (rev 10)
03:02.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
03:02.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
03:04.0 Ethernet controller: Intel Corp. 82544GC Gigabit Ethernet Controller (LOM) (rev 02)
04:01.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
04:02.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0d)

# cat /proc/cpuinfo 
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
stepping        : 7
cpu MHz         : 2395.954
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
bogomips        : 4784.12

same for processor 1, 2, 3...

Bernhard

[-- Attachment #2: stress.sh --]
[-- Type: application/x-sh, Size: 3276 bytes --]


* Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
  2003-01-15  2:35 Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20 Bernhard Sadlowski
@ 2003-01-15 10:43 ` Oleg Drokin
  2003-01-15 11:51   ` Bernhard Sadlowski
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Drokin @ 2003-01-15 10:43 UTC (permalink / raw)
  To: Bernhard Sadlowski; +Cc: reiserfs-list

Hello!

On Wed, Jan 15, 2003 at 03:35:59AM +0100, Bernhard Sadlowski wrote:

> I am using the attached stress.sh script (probably from this mailing list)
> for creating load on a reiserfs filesystem, which forks 100
> (read,write,delete) processes: 
> # mkreiserfs /dev/sda4
> # mount /dev/sda4 /backup
> # stress.sh -c /usr -n 100 /backup 
> Then wait until /backup fills up.

Hm. This reminds me of something.
Can you reproduce the same problem if you apply the patches from
ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20/ ?
These patches add quota support to reiserfs, but also change some
new-inode-related operations to prevent deadlocks like the one you are seeing.

> Any I/O freezes and even after killing the script, the remaining cp and
> mv commands don't terminate. They are in status "D". A simple "ls
> /backup" never comes back. Only a hard powerdown fixes this situation,
> because "init 6" etc. doesn't work. I have even activated the reiserfs
> debug, but I don't see any additional info.

Try pressing sysrq-t after the lockup happens, then send us the decoded
output, please.
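
SysRq-T asks the kernel to dump every task's state and stack trace to the
kernel log, which is what makes it useful for a hang like this one. On the
console it is a key combination; a sketch of triggering it from a program
instead, assuming the running kernel exposes /proc/sysrq-trigger (not
verified for this 2.4.20 build) and that the caller is root. The helper name
sysrq_write is hypothetical:

```c
#include <stdio.h>

/* Write a single SysRq command character to a trigger file.
 * trigger_path is a parameter so the helper can be exercised against
 * an ordinary file; on a real system it would be "/proc/sysrq-trigger"
 * (assumed to exist on the running kernel) and needs root.
 * Returns 0 on success, -1 on failure. */
int sysrq_write(const char *trigger_path, char command)
{
    FILE *fp = fopen(trigger_path, "w");

    if (fp == NULL)
        return -1;
    if (fputc(command, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

Under those assumptions, sysrq_write("/proc/sysrq-trigger", 't') would
request the task dump, and the result would then be read from the kernel log
(dmesg or the console).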

Thank you.

Bye,
    Oleg


* Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
  2003-01-15 10:43 ` Oleg Drokin
@ 2003-01-15 11:51   ` Bernhard Sadlowski
  2003-01-15 11:58     ` Oleg Drokin
  0 siblings, 1 reply; 7+ messages in thread
From: Bernhard Sadlowski @ 2003-01-15 11:51 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: reiserfs-list

On 15 Jan 2003 13:43, Oleg Drokin <green@namesys.com> wrote:
> Hm. This reminds me of something.
> Can you reproduce the same problem if you apply the patches from
> ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20/ ?
> These patches add quota support to reiserfs, but also change some
> new-inode-related operations to prevent deadlocks like the one you are seeing.

The unpatched kernel shows the hangs much earlier, so I assume that the
above patches solve the problem. With the patches the load goes up very
slowly but steadily to "100" and I/O no longer freezes. vmstat and
iostat still show activity. I assume you don't need any sysrq-t output
now.

Will the patches be included in 2.4.21?

Bernhard


* Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
  2003-01-15 11:51   ` Bernhard Sadlowski
@ 2003-01-15 11:58     ` Oleg Drokin
  2003-01-15 15:01       ` Oleg Drokin
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Drokin @ 2003-01-15 11:58 UTC (permalink / raw)
  To: Bernhard Sadlowski; +Cc: reiserfs-list

Hello!

On Wed, Jan 15, 2003 at 12:51:00PM +0100, Bernhard Sadlowski wrote:

> > Hm. This reminds me of something.
> > Can you reproduce the same problem if you apply the patches from
> > ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20/ ?
> > These patches add quota support to reiserfs, but also change some
> > new-inode-related operations to prevent deadlocks like the one you are seeing.
> The unpatched kernel shows the hangs much earlier, so I assume that the
> above patches solve the problem. With the patches the load goes up very
> slowly but steadily to "100" and I/O no longer freezes. vmstat and
> iostat still show activity. I assume you don't need any sysrq-t output
> now.

Ok. That's a good sign.

> Will the patches be included in 2.4.21?

No, they require quota support that won't be included in 2.4 because of
the new quota formats and related changes.
I will extract the relevant bits from the patch, though.
I will send you a short version without quota once it is ready.

Thank you.

Bye,
    Oleg


* Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
  2003-01-15 11:58     ` Oleg Drokin
@ 2003-01-15 15:01       ` Oleg Drokin
  2003-01-15 15:48         ` Bernhard Sadlowski
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Drokin @ 2003-01-15 15:01 UTC (permalink / raw)
  To: Bernhard Sadlowski; +Cc: reiserfs-list

Hello!

On Wed, Jan 15, 2003 at 02:58:04PM +0300, Oleg Drokin wrote:

> I will extract the relevant bits from the patch, though.
> I will send you a short version without quota once it is ready.

Ok, here is the patch, can you give it a try and see if it also helps?
I tested it locally and it works for me.
If you confirm everything is ok, I will try to get it into 2.4.21 in time.

Bye,
    Oleg
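
The patch that follows changes reiserfs_new_inode from returning a pointer
(with the error code passed back through an int *err argument) to returning
an int, and routes every failure through the labels out_bad_inode and
out_inserted_sd so that the inode is released in one place. A userspace
sketch of that pattern, not kernel code: struct object, setup_old and
setup_new are hypothetical stand-ins, and the transaction handling that the
real failure path also performs is left out:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; not a kernel structure. */
struct object {
    int inserted;   /* 1 once the "insert stat data" step succeeded */
    int released;   /* 1 once the callee has dropped the object */
};

/* Old style: return the object or NULL, error code through *err.
 * Every failure site repeats the release by hand. */
struct object *setup_old(struct object *obj, int fail_step, int *err)
{
    if (fail_step == 1) {
        obj->released = 1;
        *err = -EPERM;
        return NULL;
    }
    obj->inserted = 1;
    if (fail_step == 2) {
        obj->released = 1;
        *err = -EIO;
        return NULL;
    }
    return obj;
}

/* New style: return 0 or a negative errno. Failures before the insert
 * jump to out_bad and fall through to out_inserted; failures after the
 * insert jump straight to out_inserted. The release is written once. */
int setup_new(struct object *obj, int fail_step)
{
    int err;

    if (fail_step == 1) {
        err = -EPERM;
        goto out_bad;
    }
    obj->inserted = 1;
    if (fail_step == 2) {
        err = -EIO;
        goto out_inserted;
    }
    return 0;

out_bad:
    /* nothing was inserted yet, so only invalidate the object */
    obj->inserted = 0;
out_inserted:
    obj->released = 1;
    return err;
}
```

With this shape a caller only checks the return value and must not touch
the object again after a non-zero result, which matches how the callers in
namei.c below are rewritten.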

--- linux-2.4.20/fs/reiserfs/namei.c	Fri Nov 29 02:53:15 2002
+++ linux-2.4.20-t/fs/reiserfs/namei.c	Wed Jan 15 17:08:20 2003
@@ -488,27 +488,58 @@
     return 0;
 }
 
+/* quota utility function, call if you've had to abort after calling
+** new_inode_init, and have not called reiserfs_new_inode yet.
+** This should only be called on inodes that do not have stat data
+** inserted into the tree yet.
+*/
+static int drop_new_inode(struct inode *inode) {
+    make_bad_inode(inode) ;
+    iput(inode) ;
+    return 0 ;
+}
+
+/* utility function that does setup for reiserfs_new_inode.  
+** DQUOT_ALLOC_INODE cannot be called inside a transaction, so we had
+** to pull some bits of reiserfs_new_inode out into this func.
+*/
+static int new_inode_init(struct inode *inode, struct inode *dir, int mode) {
+
+    /* the quota init calls have to know who to charge the quota to, so
+    ** we have to set uid and gid here
+    */
+    inode->i_uid = current->fsuid;
+    inode->i_mode = mode;
+
+    if (dir->i_mode & S_ISGID) {
+        inode->i_gid = dir->i_gid;
+        if (S_ISDIR(mode))
+            inode->i_mode |= S_ISGID;
+    } else
+        inode->i_gid = current->fsgid;
 
+    return 0 ;
+}
+  
 static int reiserfs_create (struct inode * dir, struct dentry *dentry, int mode)
 {
     int retval;
     struct inode * inode;
-    int windex ;
     int jbegin_count = JOURNAL_PER_BALANCE_CNT * 2 ;
     struct reiserfs_transaction_handle th ;
 
-
     if (!(inode = new_inode(dir->i_sb))) {
 	return -ENOMEM ;
     }
+    retval = new_inode_init(inode, dir, mode) ;
+    if (retval)
+	return retval ;
+
     journal_begin(&th, dir->i_sb, jbegin_count) ;
     th.t_caller = "create" ;
-    windex = push_journal_writer("reiserfs_create") ;
-    inode = reiserfs_new_inode (&th, dir, mode, 0, 0/*i_size*/, dentry, inode, &retval);
-    if (!inode) {
-	pop_journal_writer(windex) ;
-	journal_end(&th, dir->i_sb, jbegin_count) ;
-	return retval;
+    retval = reiserfs_new_inode (&th, dir, mode, 0, 0/*i_size*/, dentry, inode);
+    if (retval) {
+	goto out_failed ;
     }
 	
     inode->i_op = &reiserfs_file_inode_operations;
@@ -520,20 +551,19 @@
     if (retval) {
 	inode->i_nlink--;
 	reiserfs_update_sd (&th, inode);
-	pop_journal_writer(windex) ;
-	// FIXME: should we put iput here and have stat data deleted
-	// in the same transactioin
 	journal_end(&th, dir->i_sb, jbegin_count) ;
-	iput (inode);
-	return retval;
+	iput(inode) ;
+	goto out_failed ;
     }
     reiserfs_update_inode_transaction(inode) ;
     reiserfs_update_inode_transaction(dir) ;
 
     d_instantiate(dentry, inode);
-    pop_journal_writer(windex) ;
     journal_end(&th, dir->i_sb, jbegin_count) ;
     return 0;
+
+out_failed:
+    return retval ;
 }
 
 
@@ -541,21 +571,21 @@
 {
     int retval;
     struct inode * inode;
-    int windex ;
     struct reiserfs_transaction_handle th ;
     int jbegin_count = JOURNAL_PER_BALANCE_CNT * 3; 
 
     if (!(inode = new_inode(dir->i_sb))) {
 	return -ENOMEM ;
     }
+    retval = new_inode_init(inode, dir, mode) ;
+    if (retval)
+        return retval ;
+
     journal_begin(&th, dir->i_sb, jbegin_count) ;
-    windex = push_journal_writer("reiserfs_mknod") ;
 
-    inode = reiserfs_new_inode (&th, dir, mode, 0, 0/*i_size*/, dentry, inode, &retval);
-    if (!inode) {
-	pop_journal_writer(windex) ;
-	journal_end(&th, dir->i_sb, jbegin_count) ;
-	return retval;
+    retval = reiserfs_new_inode(&th, dir, mode, 0, 0/*i_size*/, dentry, inode);
+    if (retval) {
+	goto out_failed; 
     }
 
     init_special_inode(inode, mode, rdev) ;
@@ -571,16 +601,17 @@
     if (retval) {
 	inode->i_nlink--;
 	reiserfs_update_sd (&th, inode);
-	pop_journal_writer(windex) ;
 	journal_end(&th, dir->i_sb, jbegin_count) ;
-	iput (inode);
-	return retval;
+	iput(inode) ;
+        goto out_failed; 
     }
 
     d_instantiate(dentry, inode);
-    pop_journal_writer(windex) ;
     journal_end(&th, dir->i_sb, jbegin_count) ;
     return 0;
+
+out_failed:
+    return retval ;
 }
 
 
@@ -588,15 +619,18 @@
 {
     int retval;
     struct inode * inode;
-    int windex ;
     struct reiserfs_transaction_handle th ;
     int jbegin_count = JOURNAL_PER_BALANCE_CNT * 3; 
 
+    mode = S_IFDIR | mode;
     if (!(inode = new_inode(dir->i_sb))) {
 	return -ENOMEM ;
     }
+    retval = new_inode_init(inode, dir, mode) ;
+    if (retval)
+	return retval ;
+
     journal_begin(&th, dir->i_sb, jbegin_count) ;
-    windex = push_journal_writer("reiserfs_mkdir") ;
 
     /* inc the link count now, so another writer doesn't overflow it while
     ** we sleep later on.
@@ -607,15 +641,13 @@
     /* set flag that new packing locality created and new blocks for the content     * of that directory are not displaced yet */
     dir->u.reiserfs_i.new_packing_locality = 1;
 #endif
-    mode = S_IFDIR | mode;
-    inode = reiserfs_new_inode (&th, dir, mode, 0/*symlink*/,
-				old_format_only (dir->i_sb) ? EMPTY_DIR_SIZE_V1 : EMPTY_DIR_SIZE,
-				dentry, inode, &retval);
-    if (!inode) {
-	pop_journal_writer(windex) ;
+    retval = reiserfs_new_inode(&th, dir, mode, 0/*symlink*/,
+				old_format_only (dir->i_sb) ?
+				EMPTY_DIR_SIZE_V1 : EMPTY_DIR_SIZE,
+				dentry, inode) ;
+    if (retval) {
 	dir->i_nlink-- ;
-	journal_end(&th, dir->i_sb, jbegin_count) ;
-	return retval;
+	goto out_failed ;
     }
     reiserfs_update_inode_transaction(inode) ;
     reiserfs_update_inode_transaction(dir) ;
@@ -630,19 +662,20 @@
 	inode->i_nlink = 0;
 	DEC_DIR_INODE_NLINK(dir);
 	reiserfs_update_sd (&th, inode);
-	pop_journal_writer(windex) ;
 	journal_end(&th, dir->i_sb, jbegin_count) ;
-	iput (inode);
-	return retval;
+	iput(inode) ;
+	goto out_failed ;
     }
 
     // the above add_entry did not update dir's stat data
     reiserfs_update_sd (&th, dir);
 
     d_instantiate(dentry, inode);
-    pop_journal_writer(windex) ;
     journal_end(&th, dir->i_sb, jbegin_count) ;
     return 0;
+
+out_failed:
+    return retval ;
 }
 
 static inline int reiserfs_empty_dir(struct inode *inode) {
@@ -820,7 +853,7 @@
     struct inode * inode;
     char * name;
     int item_len;
-    int windex ;
+    int mode = S_IFLNK | S_IRWXUGO ;
     struct reiserfs_transaction_handle th ;
     int jbegin_count = JOURNAL_PER_BALANCE_CNT * 3; 
 
@@ -828,31 +861,34 @@
     if (!(inode = new_inode(parent_dir->i_sb))) {
   	return -ENOMEM ;
     }
+    retval = new_inode_init(inode, parent_dir, mode) ;
+    if (retval) {
+	return retval ;
+    }
 
     item_len = ROUND_UP (strlen (symname));
     if (item_len > MAX_DIRECT_ITEM_LEN (parent_dir->i_sb->s_blocksize)) {
-	iput(inode) ;
-	return -ENAMETOOLONG;
+	retval = -ENAMETOOLONG;
+	drop_new_inode(inode) ;
+	goto out_failed ;
     }
   
     name = reiserfs_kmalloc (item_len, GFP_NOFS, parent_dir->i_sb);
     if (!name) {
-	iput(inode) ;
-	return -ENOMEM;
+	retval = -ENOMEM;
+	drop_new_inode(inode) ;
+	goto out_failed ;
     }
     memcpy (name, symname, strlen (symname));
     padd_item (name, item_len, strlen (symname));
 
     journal_begin(&th, parent_dir->i_sb, jbegin_count) ;
-    windex = push_journal_writer("reiserfs_symlink") ;
 
-    inode = reiserfs_new_inode (&th, parent_dir, S_IFLNK | S_IRWXUGO, name, strlen (symname), dentry,
-				inode, &retval);
+    retval = reiserfs_new_inode(&th, parent_dir, mode, name,
+				strlen(symname), dentry, inode) ;
     reiserfs_kfree (name, item_len, parent_dir->i_sb);
-    if (inode == 0) { /* reiserfs_new_inode iputs for us */
-	pop_journal_writer(windex) ;
-	journal_end(&th, parent_dir->i_sb, jbegin_count) ;
-	return retval;
+    if (retval) {
+	goto out_failed ;
     }
 
     reiserfs_update_inode_transaction(inode) ;
@@ -870,16 +906,17 @@
     if (retval) {
 	inode->i_nlink--;
 	reiserfs_update_sd (&th, inode);
-	pop_journal_writer(windex) ;
 	journal_end(&th, parent_dir->i_sb, jbegin_count) ;
-	iput (inode);
-	return retval;
+	iput(inode) ;
+	goto out_failed ;
     }
 
     d_instantiate(dentry, inode);
-    pop_journal_writer(windex) ;
     journal_end(&th, parent_dir->i_sb, jbegin_count) ;
     return 0;
+
+out_failed:
+    return retval ;
 }
 
 
--- linux-2.4.20/fs/reiserfs/inode.c	Fri Nov 29 02:53:15 2002
+++ linux-2.4.20-t/fs/reiserfs/inode.c	Wed Jan 15 17:08:20 2003
@@ -1463,13 +1463,22 @@
 /* inserts the stat data into the tree, and then calls
    reiserfs_new_directory (to insert ".", ".." item if new object is
    directory) or reiserfs_new_symlink (to insert symlink body if new
-   object is symlink) or nothing (if new object is regular file) */
-struct inode * reiserfs_new_inode (struct reiserfs_transaction_handle *th,
-				   struct inode * dir, int mode, 
-				   const char * symname, 
-				   int i_size, /* 0 for regular, EMTRY_DIR_SIZE for dirs,
-						  strlen (symname) for symlinks)*/
-				   struct dentry *dentry, struct inode *inode, int * err)
+   object is symlink) or nothing (if new object is regular file)
+
+   NOTE! uid and gid must already be set in the inode.  If we return
+   non-zero due to an error, we have to drop the quota previously allocated
+   for the fresh inode.  This can only be done outside a transaction, so
+   if we return non-zero, we also end the transaction.
+
+   */
+int reiserfs_new_inode (struct reiserfs_transaction_handle *th,
+				struct inode * dir, int mode,
+				const char * symname,
+				/* 0 for regular, EMTRY_DIR_SIZE for dirs,
+				   strlen (symname) for symlinks) */
+				int i_size,
+				struct dentry *dentry,
+				struct inode *inode)
 {
     struct super_block * sb;
     INITIALIZE_PATH (path_to_key);
@@ -1477,11 +1486,11 @@
     struct item_head ih;
     struct stat_data sd;
     int retval;
+    int err ;
   
     if (!dir || !dir->i_nlink) {
-	*err = -EPERM;
-	iput(inode) ;
-	return NULL;
+	err = -EPERM ;
+	goto out_bad_inode ;
     }
 
     sb = dir->i_sb;
@@ -1489,13 +1498,16 @@
 	    dir -> u.reiserfs_i.i_attrs & REISERFS_INHERIT_MASK;
     sd_attrs_to_i_attrs( inode -> u.reiserfs_i.i_attrs, inode );
 
+    /* symlink cannot be immutable or append only, right? */
+    if( S_ISLNK( inode -> i_mode ) )
+	    inode -> i_flags &= ~ ( S_IMMUTABLE | S_APPEND );
+
     /* item head of new item */
     ih.ih_key.k_dir_id = INODE_PKEY (dir)->k_objectid;
     ih.ih_key.k_objectid = cpu_to_le32 (reiserfs_get_unused_objectid (th));
     if (!ih.ih_key.k_objectid) {
-	iput(inode) ;
-	*err = -ENOMEM;
-	return NULL;
+	err = -ENOMEM ;
+	goto out_bad_inode ;
     }
     if (old_format_only (sb))
       /* not a perfect generation count, as object ids can be reused, but this
@@ -1511,12 +1523,24 @@
 #else
       inode->i_generation = ++event;
 #endif
+    /* fill stat data */
+    inode->i_nlink = (S_ISDIR (mode) ? 2 : 1);
+
+    /* uid and gid must already be set by the caller for quota init */
+
+    inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
+    inode->i_size = i_size;
+    inode->i_blocks = (inode->i_size + 511) >> 9;
+    inode->u.reiserfs_i.i_first_direct_byte = S_ISLNK(mode) ? 1 : 
+      U32_MAX/*NO_BYTES_IN_DIRECT_ITEM*/;
+
+    INIT_LIST_HEAD(&inode->u.reiserfs_i.i_prealloc_list) ;
+
     if (old_format_only (sb))
 	make_le_item_head (&ih, 0, KEY_FORMAT_3_5, SD_OFFSET, TYPE_STAT_DATA, SD_V1_SIZE, MAX_US_INT);
     else
 	make_le_item_head (&ih, 0, KEY_FORMAT_3_6, SD_OFFSET, TYPE_STAT_DATA, SD_SIZE, MAX_US_INT);
 
-
     /* key to search for correct place for new stat data */
     _make_cpu_key (&key, KEY_FORMAT_3_6, le32_to_cpu (ih.ih_key.k_dir_id),
 		   le32_to_cpu (ih.ih_key.k_objectid), SD_OFFSET, TYPE_STAT_DATA, 3/*key length*/);
@@ -1524,47 +1548,21 @@
     /* find proper place for inserting of stat data */
     retval = search_item (sb, &key, &path_to_key);
     if (retval == IO_ERROR) {
-	iput (inode);
-	*err = -EIO;
-	return NULL;
+	err = -EIO;
+	goto out_bad_inode;
     }
     if (retval == ITEM_FOUND) {
 	pathrelse (&path_to_key);
-	iput (inode);
-	*err = -EEXIST;
-	return NULL;
+	err = -EEXIST;
+	goto out_bad_inode;
     }
 
-    /* fill stat data */
-    inode->i_mode = mode;
-    inode->i_nlink = (S_ISDIR (mode) ? 2 : 1);
-    inode->i_uid = current->fsuid;
-    if (dir->i_mode & S_ISGID) {
-	inode->i_gid = dir->i_gid;
-	if (S_ISDIR(mode))
-	    inode->i_mode |= S_ISGID;
-    } else
-	inode->i_gid = current->fsgid;
-
-    /* symlink cannot be immutable or append only, right? */
-    if( S_ISLNK( inode -> i_mode ) )
-	    inode -> i_flags &= ~ ( S_IMMUTABLE | S_APPEND );
-
-    inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
-    inode->i_size = i_size;
-    inode->i_blocks = (inode->i_size + 511) >> 9;
-    inode->u.reiserfs_i.i_first_direct_byte = S_ISLNK(mode) ? 1 : 
-      U32_MAX/*NO_BYTES_IN_DIRECT_ITEM*/;
-
-    INIT_LIST_HEAD(&inode->u.reiserfs_i.i_prealloc_list) ;
-
     if (old_format_only (sb)) {
 	if (inode->i_uid & ~0xffff || inode->i_gid & ~0xffff) {
 	    pathrelse (&path_to_key);
 	    /* i_uid or i_gid is too big to be stored in stat data v3.5 */
-	    iput (inode);
-	    *err = -EINVAL;
-	    return NULL;
+	    err = -EINVAL;
+	    goto out_bad_inode;
 	}
 	inode2sd_v1 (&sd, inode);
     } else
@@ -1595,10 +1593,9 @@
 #endif
     retval = reiserfs_insert_item (th, &path_to_key, &key, &ih, (char *)(&sd));
     if (retval) {
-	iput (inode);
-	*err = retval;
 	reiserfs_check_path(&path_to_key) ;
-	return NULL;
+	err = retval;
+	goto out_bad_inode;
     }
 
 #ifdef DISPLACE_NEW_PACKING_LOCALITIES
@@ -1617,19 +1614,30 @@
 	retval = reiserfs_new_symlink (th, &ih, &path_to_key, symname, i_size);
     }
     if (retval) {
-      inode->i_nlink = 0;
-	iput (inode);
-	*err = retval;
+	err = retval;
 	reiserfs_check_path(&path_to_key) ;
-	return NULL;
+	journal_end(th, th->t_super, th->t_blocks_allocated) ;
+	goto out_inserted_sd;
     }
 
     insert_inode_hash (inode);
-    // we do not mark inode dirty: on disk content matches to the
-    // in-core one
+    reiserfs_update_sd(th, inode) ;
     reiserfs_check_path(&path_to_key) ;
 
-    return inode;
+    return 0;
+out_bad_inode:
+    /* Invalidate the object, nothing was inserted yet */
+    INODE_PKEY(inode)->k_objectid = 0;
+
+    /* dquot_drop must be done outside a transaction */
+    journal_end(th, th->t_super, th->t_blocks_allocated) ;
+    make_bad_inode(inode);
+
+out_inserted_sd:
+    inode->i_nlink = 0;
+    th->t_trans_id = 0 ; /* so the caller can't use this handle later */
+    iput(inode) ;
+    return err;
 }
 
 /*
--- linux-2.4.20/include/linux/reiserfs_fs.h	Tue Dec 10 13:47:47 2002
+++ linux-2.4.20-t/include/linux/reiserfs_fs.h	Wed Jan 15 15:51:06 2003
@@ -1748,11 +1748,12 @@
 struct inode * reiserfs_iget (struct super_block * s, 
 			      const struct cpu_key * key);
 
-
-struct inode * reiserfs_new_inode (struct reiserfs_transaction_handle *th, 
-				   struct inode * dir, int mode, 
-				   const char * symname, int item_len,
-				   struct dentry *dentry, struct inode *inode, int * err);
+int reiserfs_new_inode (struct reiserfs_transaction_handle *th,
+                               struct inode * dir, int mode,
+                               const char * symname,
+                               int i_size,
+                               struct dentry *dentry,
+                               struct inode *inode);
 int reiserfs_sync_inode (struct reiserfs_transaction_handle *th, struct inode * inode);
 void reiserfs_update_sd (struct reiserfs_transaction_handle *th, struct inode * inode);
 


* Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
  2003-01-15 15:01       ` Oleg Drokin
@ 2003-01-15 15:48         ` Bernhard Sadlowski
  2003-01-15 15:51           ` Oleg Drokin
  0 siblings, 1 reply; 7+ messages in thread
From: Bernhard Sadlowski @ 2003-01-15 15:48 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: reiserfs-list

On 15 Jan 2003 18:01, Oleg Drokin <green@namesys.com> wrote:
> Ok, here is the patch, can you give it a try and see if it also helps?
> I tested it locally and it works for me.
> If you confirm everything is ok, I will try to get it into 2.4.21 in time.

At first glance it seems to work. I will now run that script overnight
and will tell you if any problems arise.

Thanks,
Bernhard


* Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
  2003-01-15 15:48         ` Bernhard Sadlowski
@ 2003-01-15 15:51           ` Oleg Drokin
  0 siblings, 0 replies; 7+ messages in thread
From: Oleg Drokin @ 2003-01-15 15:51 UTC (permalink / raw)
  To: Bernhard Sadlowski; +Cc: reiserfs-list

Hello!

On Wed, Jan 15, 2003 at 04:48:52PM +0100, Bernhard Sadlowski wrote:
> > Ok, here is the patch, can you give it a try and see if it also helps?
> > I tested it locally and it works for me.
> > If you confirm everything is ok, I will try to get it into 2.4.21 in time.
> At first glance it seems to work. I will now run that script overnight
> and will tell you if any problems arise.

Ok, Thank you very much.

Bye,
    Oleg
