Oops with in nfsd - 2.4.19-pre6


* Oops with in nfsd - 2.4.19-pre6
@ 2002-10-29 14:59 Philippe Gramoullé
  2002-10-29 15:14 ` Oleg Drokin
  0 siblings, 1 reply; 21+ messages in thread
From: Philippe Gramoullé @ 2002-10-29 14:59 UTC (permalink / raw)
  To: reiserfs


Hi,

Is this still the infamous NFSD/inode race ?

Thanks,

Philippe.


root@smembf08:~# cat reiser-oops | ksymoops -m /boot/System.map-2.4.19-pre6 
ksymoops 2.4.5 on i686 2.4.19-pre6.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.19-pre6/ (default)
     -m /boot/System.map-2.4.19-pre6 (specified)

kernel BUG at prints.c:334!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c0194899>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 0000003d   ebx: c0268640   ecx: ffffffff   edx: 00000002
esi: f781c400   edi: 00000019   ebp: 00000000   esp: f6125af4
ds: 0018   es: 0018   ss: 0018
Process nfsd (pid: 288, stackpage=f6125000)
Stack: c02650da c031f5e0 c0268640 f6125b18 f6125ec8 f6fbad20 c01a1bcc f781c400 
       c0268640 00000400 f781c400 000007b1 f6125ec8 03d8eefe ffffffff 00000000 
       c0184144 f6125ec8 f781c400 f6fbad20 f6125ec8 03d8eefe 00000001 e1ae7de0 
Call Trace: [<c01a1bcc>] [<c0184144>] [<c0184283>] [<c019b23b>] [<c019c08f>] 
   [<c019c879>] [<c019bcbf>] [<c018a09a>] [<c0189fd0>] [<c014bd67>] [<c0149fb6>] 
   [<c014279d>] [<c0175d6b>] [<c0179dee>] [<c0170d63>] [<c0238405>] [<c0170b5f>] 
   [<c0106ff4>] 
Code: 0f 0b 4e 01 e0 50 26 c0 68 e0 f5 31 c0 85 f6 74 16 0f b7 46 


>>EIP; c0194899 <reiserfs_panic+29/60>   <=====

>>ebx; c0268640 <MAX_KEY+2220/4aa8>
>>ecx; ffffffff <END_OF_CODE+76caea0/????>
>>esi; f781c400 <_end+374d4c24/385e7824>
>>esp; f6125af4 <_end+35dde318/385e7824>

Trace; c01a1bcc <journal_mark_dirty+160/320>
Trace; c0184144 <_reiserfs_free_block+b4/1cc>
Trace; c0184283 <reiserfs_free_block+27/30>
Trace; c019b23b <prepare_for_delete_or_cut+663/70c>
Trace; c019c08f <reiserfs_cut_from_item+97/520>
Trace; c019c879 <reiserfs_do_truncate+2fd/42c>
Trace; c019bcbf <reiserfs_delete_object+23/50>
Trace; c018a09a <reiserfs_delete_inode+ca/13c>
Trace; c0189fd0 <reiserfs_delete_inode+0/13c>
Trace; c014bd67 <iput+163/268>
Trace; c0149fb6 <d_delete+62/a0>
Trace; c014279d <vfs_unlink+1e9/220>
Trace; c0175d6b <nfsd_unlink+193/1e0>
Trace; c0179dee <nfsd3_proc_remove+c6/d4>
Trace; c0170d63 <nfsd_dispatch+d3/19a>
Trace; c0238405 <svc_process+28d/4d4>
Trace; c0170b5f <nfsd+1f7/328>
Trace; c0106ff4 <kernel_thread+28/38>

Code;  c0194899 <reiserfs_panic+29/60>
00000000 <_EIP>:
Code;  c0194899 <reiserfs_panic+29/60>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c019489b <reiserfs_panic+2b/60>
   2:   4e                        dec    %esi
Code;  c019489c <reiserfs_panic+2c/60>
   3:   01 e0                     add    %esp,%eax
Code;  c019489e <reiserfs_panic+2e/60>
   5:   50                        push   %eax
Code;  c019489f <reiserfs_panic+2f/60>
   6:   26 c0 68 e0 f5            shrb   $0xf5,%es:0xffffffe0(%eax)
Code;  c01948a4 <reiserfs_panic+34/60>
   b:   31 c0                     xor    %eax,%eax
Code;  c01948a6 <reiserfs_panic+36/60>
   d:   85 f6                     test   %esi,%esi
Code;  c01948a8 <reiserfs_panic+38/60>
   f:   74 16                     je     27 <_EIP+0x27> c01948c0 <reiserfs_panic+50/60>
Code;  c01948aa <reiserfs_panic+3a/60>
  11:   0f b7 46 00               movzwl 0x0(%esi),%eax
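
A note on reading the decode above: the bytes after ud2a are probably data, not instructions. Assuming the i386 BUG() macro of this kernel generation emits ud2 followed by a 16-bit line number and a 32-bit pointer to the file name (an assumption, nothing in this thread states it), the Code: bytes can be decoded with a small user-space sketch. The names bug_info and decode_bug are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout after the ud2 opcode (0f 0b):
 *   2 bytes: line number, little-endian
 *   4 bytes: address of the file name string, little-endian */
struct bug_info {
    int is_bug;         /* 1 if the bytes start with ud2 */
    unsigned line;      /* decoded line number */
    uint32_t file_ptr;  /* decoded pointer to the file name */
};

static struct bug_info decode_bug(const uint8_t *code, size_t len)
{
    struct bug_info info = {0, 0, 0};

    if (len < 8 || code[0] != 0x0f || code[1] != 0x0b)
        return info;    /* not a ud2-style BUG() site */

    info.is_bug = 1;
    info.line = (unsigned)code[2] | ((unsigned)code[3] << 8);
    info.file_ptr = (uint32_t)code[4] | ((uint32_t)code[5] << 8) |
                    ((uint32_t)code[6] << 16) | ((uint32_t)code[7] << 24);
    return info;
}
```

Fed the first eight bytes of the Code: line (0f 0b 4e 01 e0 50 26 c0), this gives line 0x014e = 334, which matches "kernel BUG at prints.c:334", and a file pointer of 0xc02650e0. So the dec/add/push/shrb lines in the disassembly are most likely the line number and pointer being misread as instructions.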


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-10-29 14:59 Philippe Gramoullé
@ 2002-10-29 15:14 ` Oleg Drokin
  2002-10-29 15:20   ` Philippe Gramoullé
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Drokin @ 2002-10-29 15:14 UTC (permalink / raw)
  To: Philippe Gramoullé; +Cc: reiserfs, mason

Hello!

On Tue, Oct 29, 2002 at 03:59:07PM +0100, Philippe Gramoullé wrote:

> Is this still the infamous NFSD/inode race ?

No.
This is a bug in the journalling code.
Something related to improper accounting of transaction blocks.
Chris said he will try to take care of it and rejected my
simple but invalid patch.

Chris, any news on that?

> >>EIP; c0194899 <reiserfs_panic+29/60>   <=====
> Trace; c01a1bcc <journal_mark_dirty+160/320>

Bye,
    Oleg


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-10-29 15:14 ` Oleg Drokin
@ 2002-10-29 15:20   ` Philippe Gramoullé
  2002-10-29 15:26     ` Oleg Drokin
  2002-10-31 20:38     ` Chris Mason
  0 siblings, 2 replies; 21+ messages in thread
From: Philippe Gramoullé @ 2002-10-29 15:20 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: reiserfs-list, mason


FYI,

This is in the log, before the oops:

 journal-1413: journal_mark_dirty: j_len (1024) is too big

and the filer is low on space :o)

/dev/sdb1            572418604 572341220     77384 100% /storage

Could it be because the filer hit 0 on space?

Thanks,

philippe

On Tue, 29 Oct 2002 18:14:38 +0300
Oleg Drokin <green@namesys.com> wrote:

  |  Hello!
  |  
  |  On Tue, Oct 29, 2002 at 03:59:07PM +0100, Philippe Gramoullé wrote:
  |  
  |  > Is this still the infamous NFSD/inode race ?
  |  
  |  No.
  |  This is a bug in journalling code.
  |  Something related to improper transaction blocks accounting.
  |  Chris said he will try take care of it and rejected my 
  |  simple, but invalid patch.
  |  
  |  Chris, any news on that?
  |  
  |  > >>EIP; c0194899 <reiserfs_panic+29/60>   <=====
  |  > Trace; c01a1bcc <journal_mark_dirty+160/320>
  |  
  |  Bye,
  |      Oleg
  |  


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-10-29 15:20   ` Philippe Gramoullé
@ 2002-10-29 15:26     ` Oleg Drokin
  2002-10-31 20:38     ` Chris Mason
  1 sibling, 0 replies; 21+ messages in thread
From: Oleg Drokin @ 2002-10-29 15:26 UTC (permalink / raw)
  To: Philippe Gramoullé; +Cc: reiserfs-list, mason

Hello!

On Tue, Oct 29, 2002 at 04:20:14PM +0100, Philippe Gramoullé wrote:

> This in the log , before the oops :
>  journal-1413: journal_mark_dirty: j_len (1024) is too big

Yes, this is the assertion that failed (j_len is supposed to be less than
TRANS_MAX something).

> and filer is low on space :o)
> /dev/sdb1            572418604 572341220     77384 100% /storage
> Should it be because filer hit 0 on space ?

No, I believe the free space on disk is irrelevant.
The journal is in its own separate area anyway.
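
The failed check can be pictured with a small user-space model. This is not the kernel code: toy_journal, toy_mark_dirty and TOY_TRANS_MAX are hypothetical names, and the limit of 1024 is simply taken from the "j_len (1024)" log line.

```c
#include <stdio.h>

#define TOY_TRANS_MAX 1024  /* assumed limit, from the "j_len (1024)" log line */

struct toy_journal {
    int j_len;  /* blocks logged so far in the running transaction */
};

/* Log one more block in the running transaction.
 * Returns 0 on success, -1 at the point where the real code panics. */
static int toy_mark_dirty(struct toy_journal *j)
{
    if (j->j_len >= TOY_TRANS_MAX) {
        fprintf(stderr, "journal_mark_dirty: j_len (%d) is too big\n",
                j->j_len);
        return -1;
    }
    j->j_len++;
    return 0;
}
```

In this model the 1025th block logged into one transaction trips the check, whatever the amount of free space on the data area.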

Bye,
    Oleg


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-10-29 15:20   ` Philippe Gramoullé
  2002-10-29 15:26     ` Oleg Drokin
@ 2002-10-31 20:38     ` Chris Mason
  1 sibling, 0 replies; 21+ messages in thread
From: Chris Mason @ 2002-10-31 20:38 UTC (permalink / raw)
  To: Philippe Gramoullé; +Cc: Oleg Drokin, reiserfs-list

On Tue, 2002-10-29 at 10:20, Philippe Gramoullé wrote:
> 
> FYI,
> 
> This in the log , before the oops :
> 
>  journal-1413: journal_mark_dirty: j_len (1024) is too big
> 
>
Ok, this patch goes on top of the quota patches because that is what
Philippe is currently running.  Merging with the pure kernel is trivial,
so I'll do that later.

Since Philippe is putting this onto production machines, I think he
should wait for someone from namesys to review before using the patch.

The idea is that during boundless operations (creating a hole, and
truncates), the journal code wasn't properly reserving log blocks. 
There are two parts to the fix:

1) always reserve extra log blocks when
journal_transaction_should_end() returns 0

2) always send the correct number of log blocks to
journal_transaction_should_end()

#2 also makes hole creation significantly faster.  Before, it used the
number of blocks logged in the last transaction as the number it will
log in the next one, which means it might try to reserve 300 or so log
blocks.  This would usually force the current transaction to close,
leading to a small transaction and lower performance.
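
Part 1 of the fix can be sketched as a user-space toy. The struct and function names are hypothetical, and the single threshold test is an assumed simplification (the real function checks more conditions, such as j_cnode_free). The only part taken from the patch is the bump of both counters on the "continue" path.

```c
#define TOY_TRANS_MAX 1024  /* assumed per-transaction block limit */

struct toy_journal {
    int j_len_alloc;         /* blocks reserved by all writers so far */
};

struct toy_handle {
    int t_blocks_allocated;  /* blocks reserved through this handle */
    struct toy_journal *j;   /* the shared running transaction */
};

/* Returns 1 if the caller must end the transaction, 0 if it may continue.
 * When it returns 0, new_alloc blocks are reserved right away so another
 * writer cannot take them before the caller logs its blocks. */
static int toy_should_end(struct toy_handle *th, int new_alloc)
{
    if (th->j->j_len_alloc + new_alloc >= TOY_TRANS_MAX)
        return 1;

    th->t_blocks_allocated += new_alloc;
    th->j->j_len_alloc += new_alloc;
    return 0;
}
```

With two handles sharing one toy_journal, a first request for 600 blocks is allowed and recorded, so a second request for 600 is told to end the transaction. Without the bump, both would be allowed to continue and could log more blocks than the transaction can hold.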

Anyway, here's the patch:

-chris

#
# against 2.4.19 + quota + nesting
#
# when testing for a transaction restart, make sure to bump the number
# of blocks allocated, otherwise by the time you get around to 
# journal_mark_dirty, the blocks might have been used by a different
# writer
#
# when restarting the transaction in reiserfs_get_block and truncates,
# don't use th->t_blocks_allocated as the count for the new transaction.
# It gets incremented as the transaction grows during the boundless op,
# and might get very very large.
#
diff -Nru a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
--- a/fs/reiserfs/inode.c	Thu Oct 31 15:21:49 2002
+++ b/fs/reiserfs/inode.c	Thu Oct 31 15:21:49 2002
@@ -232,9 +232,9 @@
 }
 
 /*static*/ void restart_transaction(struct reiserfs_transaction_handle *th,
-				struct inode *inode, struct path *path) {
+				struct inode *inode, struct path *path,
+				int jbegin_count) {
   struct super_block *s = th->t_super ;
-  int len = th->t_blocks_allocated ;
 
   /* we cannot restart while nested */
   if (th->t_refcount > 1) {
@@ -242,8 +242,8 @@
   }
   pathrelse(path) ;
   reiserfs_update_sd(th, inode) ;
-  journal_end(th, s, len) ;
-  journal_begin(th, s, len) ;
+  journal_end(th, s, th->t_blocks_allocated) ;
+  journal_begin(th, s, jbegin_count) ;
   reiserfs_update_inode_transaction(inode) ;
 }
 
@@ -655,7 +655,7 @@
 	    ** some blocks.  releases the path, so we have to go back to
 	    ** research if we succeed on the second try
 	    */
-	    restart_transaction(&th, inode, &path) ; 
+	    restart_transaction(&th, inode, &path, jbegin_count) ; 
 	    repeat = _allocate_block(&th, inode,&allocated_block_nr,tag,create);
 
 	    if (repeat != NO_DISK_SPACE && repeat != QUOTA_EXCEEDED) {
@@ -856,8 +856,8 @@
 	** release the path so that anybody waiting on the path before
 	** ending their transaction will be able to continue.
 	*/
-	if (journal_transaction_should_end(&th, th.t_blocks_allocated)) {
-	  restart_transaction(&th, inode, &path) ; 
+	if (journal_transaction_should_end(&th, jbegin_count)) {
+	  restart_transaction(&th, inode, &path, jbegin_count) ; 
 	}
 	/* inserting indirect pointers for a hole can take a 
 	** long time.  reschedule if needed
diff -Nru a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
--- a/fs/reiserfs/journal.c	Thu Oct 31 15:21:49 2002
+++ b/fs/reiserfs/journal.c	Thu Oct 31 15:21:49 2002
@@ -2011,6 +2011,12 @@
        SB_JOURNAL(th->t_super)->j_cnode_free < (JOURNAL_TRANS_MAX * 3)) { 
     return 1 ;
   }
+  
+  /* we are allowing them to continue in the current transaction, so we
+   * have to bump the blocks allocated now.
+   */
+  th->t_blocks_allocated += new_alloc;
+  SB_JOURNAL(th->t_super)->j_len_alloc += new_alloc ;
   return 0 ;
 }
 
diff -Nru a/fs/reiserfs/stree.c b/fs/reiserfs/stree.c
--- a/fs/reiserfs/stree.c	Thu Oct 31 15:21:49 2002
+++ b/fs/reiserfs/stree.c	Thu Oct 31 15:21:49 2002
@@ -1730,6 +1730,7 @@
 	n_new_file_size;/* New file size. */
     int                   n_deleted;      /* Number of deleted or truncated bytes. */
     int retval;
+    int jbegin_count = th->t_blocks_allocated;
 
     if ( ! (S_ISREG(p_s_inode->i_mode) || S_ISDIR(p_s_inode->i_mode) || S_ISLNK(p_s_inode->i_mode)) )
 	return;
@@ -1809,16 +1810,15 @@
 	** sure the file is consistent before ending the current trans
 	** and starting a new one
 	*/
-        if (journal_transaction_should_end(th, th->t_blocks_allocated)) {
-	  int orig_len_alloc = th->t_blocks_allocated ;
+        if (journal_transaction_should_end(th, jbegin_count)) {
 	  decrement_counters_in_path(&s_search_path) ;
 
 	  if (update_timestamps) {
 	      p_s_inode->i_mtime = p_s_inode->i_ctime = CURRENT_TIME;
 	  } 
 	  reiserfs_update_sd(th, p_s_inode) ;
-	  journal_end(th, p_s_inode->i_sb, orig_len_alloc) ;
-	  journal_begin(th, p_s_inode->i_sb, orig_len_alloc) ;
+	  journal_end(th, p_s_inode->i_sb, th->t_blocks_allocated) ;
+	  journal_begin(th, p_s_inode->i_sb, jbegin_count) ;
 	  reiserfs_update_inode_transaction(p_s_inode) ;
 	}
     } while ( n_file_size > ROUND_UP (n_new_file_size) &&




* Re: Oops with in nfsd - 2.4.19-pre6
@ 2002-10-31 21:08 JP Howard
  2002-10-31 21:30 ` Chris Mason
  0 siblings, 1 reply; 21+ messages in thread
From: JP Howard @ 2002-10-31 21:08 UTC (permalink / raw)
  To: Chris Mason, Philippe Gramoullé; +Cc: Oleg Drokin, ReiserFS List

On 31 Oct 2002 15:38:19 -0500, "Chris Mason" <mason@suse.com> said:
<...>
> The idea is that during boundless operations (creating a hole, and
> truncates), the journal code wasn't properly reserving log blocks. 
<...>

Chris, what can trigger this situation? We're currently running
data=journal on 2.4.20pre in production--are we at risk?


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-10-31 21:08 Oops with in nfsd - 2.4.19-pre6 JP Howard
@ 2002-10-31 21:30 ` Chris Mason
  2002-11-01 14:46   ` Christopher Barry
  2002-11-12 17:29   ` Philippe Gramoullé
  0 siblings, 2 replies; 21+ messages in thread
From: Chris Mason @ 2002-10-31 21:30 UTC (permalink / raw)
  To: JP Howard; +Cc: Philippe Gramoullé, Oleg Drokin, ReiserFS List

On Thu, 2002-10-31 at 16:08, JP Howard wrote:
> On 31 Oct 2002 15:38:19 -0500, "Chris Mason" <mason@suse.com> said:
> <...>
> > The idea is that during boundless operations (creating a hole, and
> > truncates), the journal code wasn't properly reserving log blocks. 
> <...>
> 
> Chris, what can trigger this situation? We're currently running
> data=journal on 2.4.20pre in production--are we at risk?
> 

This bug is pretty hard to hit.  It has been in every single version of
journaling reiserfs, including 2.2.x.  So far, we've gotten two reports
of it in about 3 years (oddly, both were this month).

What can trigger it?  I honestly haven't been able to force the problem
to happen, it should require a very high load of processes doing
deletions (or hole creations), along with a very high system load in
general.

The logging code pads all the reservations for space in the log, making
it very hard to hit the hard limit of 1024 blocks per transaction.

Both sites that have hit the bug have a very large number of files
(millions), meaning that metadata operations will tend to log more
blocks, making the bug more likely.

If you have less than a million files, you'll probably never be able to
hit it.  I'm still going to try and get the fix into 2.4.20 though.

-chris


* Re: Oops with in nfsd - 2.4.19-pre6
@ 2002-10-31 22:12 JP Howard
  0 siblings, 0 replies; 21+ messages in thread
From: JP Howard @ 2002-10-31 22:12 UTC (permalink / raw)
  To: Chris Mason; +Cc: Philippe Gramoullé, Oleg Drokin, ReiserFS List

On 31 Oct 2002 16:30:04 -0500, "Chris Mason" <mason@suse.com> said:
> This bug is pretty hard to hit.  It has been in every single version of
> journaling reiserfs, including 2.2.x.  So far, we've gotten two reports
> of it in about 3 years (oddly, both were this month).
> 
<...>
> If you have less than a million files, you'll probably never be able to
> hit it.  I'm still going to try and get the fix into 2.4.20 though.

We have over a million files. I'd love to see a patch against 2.4.20
that's been checked by the Namesys crew if possible, even if it doesn't
make it into 2.4.20-release.

Many thanks,
  Jeremy


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-10-31 21:30 ` Chris Mason
@ 2002-11-01 14:46   ` Christopher Barry
  2002-11-12 17:29   ` Philippe Gramoullé
  1 sibling, 0 replies; 21+ messages in thread
From: Christopher Barry @ 2002-11-01 14:46 UTC (permalink / raw)
  To: Chris Mason
  Cc: JP Howard, Philippe Gramoullé, Oleg Drokin, ReiserFS List

On Thu, 2002-10-31 at 16:30, Chris Mason wrote:
> On Thu, 2002-10-31 at 16:08, JP Howard wrote:
> > On 31 Oct 2002 15:38:19 -0500, "Chris Mason" <mason@suse.com> said:
> > <...>
> > > The idea is that during boundless operations (creating a hole, and
> > > truncates), the journal code wasn't properly reserving log blocks. 
> > <...>
> > 
> > Chris, what can trigger this situation? We're currently running
> > data=journal on 2.4.20pre in production--are we at risk?
> > 
> 
> This bug is pretty hard to hit.  It has been in every single version of
> journaling reiserfs, including 2.2.x.  So far, we've gotten two reports
> of it in about 3 years (oddly, both were this month).
> 
> What can trigger it?  I honestly haven't been able to force the problem
> to happen, it should require a very high load of processes doing
> deletions (or hole creations), along with a very high system load in
> general.
> 
> The logging code pads all the reservations for space in the log, making
> it very hard to hit the hard limit of 1024 blocks per transaction.
> 
> Both sites that have hit the bug have a very large number of files
> (millions), meaning that metadata operations will tend to log more
> blocks, making the bug more likely.
> 
> If you have less than a million files, you'll probably never be able to
> hit it.  I'm still going to try and get the fix into 2.4.20 though.
> 
> -chris
> 
> 
Off-topic, and not meaning to scold, but _why_ are you running 2.4.20pre
in a _production_ environment anyway? Just curious.

-C




* Re: Oops with in nfsd - 2.4.19-pre6
  2002-10-31 21:30 ` Chris Mason
  2002-11-01 14:46   ` Christopher Barry
@ 2002-11-12 17:29   ` Philippe Gramoullé
  2002-11-13 18:14     ` Chris Mason
  1 sibling, 1 reply; 21+ messages in thread
From: Philippe Gramoullé @ 2002-11-12 17:29 UTC (permalink / raw)
  To: Chris Mason; +Cc: reiserfs-list, jh_lists, green


Hi Chris,

On 31 Oct 2002 16:30:04 -0500
Chris Mason <mason@suse.com> wrote:

  |  On Thu, 2002-10-31 at 16:08, JP Howard wrote:
  |  > On 31 Oct 2002 15:38:19 -0500, "Chris Mason" <mason@suse.com> said:
  |  > <...>
  |  > > The idea is that during boundless operations (creating a hole, and
  |  > > truncates), the journal code wasn't properly reserving log blocks. 
  |  > <...>
  |  > 
  |  > Chris, what can trigger this situation? We're currently running
  |  > data=journal on 2.4.20pre in production--are we at risk?
  |  > 
  |  
  |  This bug is pretty hard to hit.  It has been in every single version of
  |  journaling reiserfs, including 2.2.x.  So far, we've gotten two reports
  |  of it in about 3 years (oddly, both were this month).

We hit it almost every day last week.

Here is another one, from a few minutes ago :-(

ksymoops 2.4.5 on i686 2.4.19-pre6.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.19-pre6/ (default)
     -m /boot/System.map-2.4.19-pre6 (specified)

kernel BUG at prints.c:334!
invalid operand: 0000
CPU:    0
EIP:    0010:[reiserfs_panic+41/96]    Not tainted
EFLAGS: 00010286
eax: 0000003d   ebx: c0268640   ecx: 00000002   edx: 02000000
esi: f7e4d800   edi: 00000019   ebp: 00000000   esp: f43f1af4
ds: 0018   es: 0018   ss: 0018
Process nfsd (pid: 230, stackpage=f43f1000)
Stack: c02650da c031f5e0 c0268640 f43f1b18 f43f1ec8 f538eae0 c01a1bcc f7e4d800 
       c0268640 00000400 f7e4d800 000004cb f43f1ec8 0265bb4f ffffffff 00000000 
       c0184144 f43f1ec8 f7e4d800 f538eae0 f43f1ec8 0265bb4f 00000001 cb6dd4a0 
Call Trace: [journal_mark_dirty+352/800] [_reiserfs_free_block+180/460] [reiserfs_free_block+39/48] [prepare_for_delete_or_cut+1635/1804] [reiserfs_cut_from_item+151/1312] [reiserfs_do_truncate+765/1068] [reiserfs_delete_object+35/80] [reiserfs_delete_inode+202/316] [reiserfs_delete_inode+0/316] [iput+355/616] [d_delete+98/160] [vfs_unlink+489/544] [nfsd_unlink+403/480] [nfsd3_proc_remove+198/212] [nfsd_dispatch+211/410] [svc_process+653/1236] [nfsd+503/808] [kernel_thread+40/56] 
Code: 0f 0b 4e 01 e0 50 26 c0 68 e0 f5 31 c0 85 f6 74 16 0f b7 46
Using defaults from ksymoops -t elf32-i386 -a i386


>>ebx; c0268640 <MAX_KEY+2220/4aa8>
>>edx; 02000000 Before first symbol
>>esi; f7e4d800 <_end+37b06024/385e7824>
>>esp; f43f1af4 <_end+340aa318/385e7824>

Code;  00000000 Before first symbol
00000000 <_EIP>:
Code;  00000000 Before first symbol
   0:   0f 0b                     ud2a   
Code;  00000002 Before first symbol
   2:   4e                        dec    %esi
Code;  00000003 Before first symbol
   3:   01 e0                     add    %esp,%eax
Code;  00000005 Before first symbol
   5:   50                        push   %eax
Code;  00000006 Before first symbol
   6:   26 c0 68 e0 f5            shrb   $0xf5,%es:0xffffffe0(%eax)
Code;  0000000b Before first symbol
   b:   31 c0                     xor    %eax,%eax
Code;  0000000d Before first symbol
   d:   85 f6                     test   %esi,%esi
Code;  0000000f Before first symbol
   f:   74 16                     je     27 <_EIP+0x27> 00000027 Before first symbol
Code;  00000011 Before first symbol
  11:   0f b7 46 00               movzwl 0x0(%esi),%eax


  |  
  |  What can trigger it?  I honestly haven't been able to force the problem
  |  to happen, it should require a very high load of processes doing
  |  deletions (or hole creations), along with a very high system load in
  |  general.

Hmm, this is what we currently have :-( Right now we run 128 NFSD threads. 2.4.19-pre6 has
a hardcoded limit, and I'd like to raise it to 256 NFSD threads or so, which would make it even more likely to
hit the bug.

  |  
  |  The logging code pads all the reservations for space in the log, making
  |  it very hard to hit the hard limit of 1024 blocks per transaction.
  |  
  |  Both sites that have hit the bug have a very large number of files
  |  (millions), meaning that metadata operations will tend to log more
  |  blocks, making the bug more likely.
  |  
  |  If you have less than a million files, you'll probably never be able to
  |  hit it.  I'm still going to try and get the fix into 2.4.20 though.

Do you have a fix that could be used in 2.4.20-(pre|rc) right now? (We need quota as well.)

This is becoming really critical, and even a patch that is not 110% tested would be welcome.

Thanks,

Philippe.

  |  
  |  -chris
  |  
  |  
  |  


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-11-12 17:29   ` Philippe Gramoullé
@ 2002-11-13 18:14     ` Chris Mason
  2002-11-13 18:40       ` Philippe Gramoullé
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Mason @ 2002-11-13 18:14 UTC (permalink / raw)
  To: Philippe Gramoullé; +Cc: reiserfs-list, jh_lists, green

On Tue, 2002-11-12 at 12:29, Philippe Gramoullé wrote:
> Do you have a fix that could be used in 2.4.20-(pre|rc) right now ? ( we need quota as well)
> 
> This is becoming really critical and even a not 110% tested patch would be welcome.

Ok, I was hoping for some confirmation that you've seen the bug fixed
with the quota code.  Which kernel version + list of patches are you
best able to test on?  I'll make you a patch against that.

-chris




* Re: Oops with in nfsd - 2.4.19-pre6
  2002-11-13 18:14     ` Chris Mason
@ 2002-11-13 18:40       ` Philippe Gramoullé
  2002-11-13 20:23         ` Chris Mason
  0 siblings, 1 reply; 21+ messages in thread
From: Philippe Gramoullé @ 2002-11-13 18:40 UTC (permalink / raw)
  To: reiserfs-list

On 13 Nov 2002 13:14:18 -0500
Chris Mason <mason@suse.com> wrote:

  |  Ok, I was hoping for some confirmation that you've seen the bug fixed
  |   with the quota code.  Which kernel version + list of patches are you
  |   best able to test on?  I'll make you a patch against that.
  |  
  |   -chris

Unfortunately, we hit the bug again this afternoon :-(

We used to have 2.4.19-pre6 with the quota patches from
ftp.suse.com:/pub/people/mason/patches/reiserfs/quota-2.4/2.4.19
(0xFFFF test removed in sys_quotactl() in fs/dquot.c to support 32-bit UIDs)
and, IIRC, the NFS_ALL patch for 2.4.19.

Being unable to run quota anyway with 2.4.19-pre6, as it would make the filer crash, I upgraded to
2.4.20rc1 + data-logging patches and mounted the fs with noatime,nodiratime and data=journal

The filer is still heavily loaded but runs fine. The problem now is that we don't have quota working anymore,
and this is a _big_ issue as warez lovers test our systems every single minute.

BTW, should I apply the patch posted on LKML?
Subject:2.[45] fixes for design locking bug in wait_on_page/wait_on_buffer/get_request_wait

Ideally, I'd like to stay with 2.4.20rc1 while having working quota.
I couldn't check closely enough because I've been too busy, but I think many important patches have been
merged into RC1 (new block allocator, etc.).
I tried to look at merging the quota code with RC1, but unfortunately my limited knowledge of quota/reiserfs
prevents me from doing anything but a cosmetic merge :o)

Let me know what your plans are.

Thanks much,

Philippe


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-11-13 18:40       ` Philippe Gramoullé
@ 2002-11-13 20:23         ` Chris Mason
  2002-11-14  2:33           ` Chris Mason
  2002-11-14 16:04           ` Oops with in nfsd - 2.4.19-pre6 Philippe Gramoullé
  0 siblings, 2 replies; 21+ messages in thread
From: Chris Mason @ 2002-11-13 20:23 UTC (permalink / raw)
  To: Philippe Gramoullé; +Cc: reiserfs-list

On Wed, 2002-11-13 at 13:40, Philippe Gramoullé wrote:
> On 13 Nov 2002 13:14:18 -0500
> Chris Mason <mason@suse.com> wrote:
> 
>   |  Ok, I was hoping for some confirmation that you've seen the bug fixed
>   |   with the quota code.  Which kernel version + list of patches are you
>   |   best able to test on?  I'll make you a patch against that.
>   |  
>   |   -chris
> 
> Unfortunately, we hit the bug again this afternoon :-(
> 
> We used to have 2.4.19-pre6 with the quota patches from
> ftp.suse.com:/pub/people/mason/patches/reiserfs/quota-2.4/2.4.19
> ( 0xFFFF test removed in sys_quotactl() in fs/dquot.c to support 32 bit UIDS)
> and IIRC NFS_ALL patch for 2.4.19
> 
> Being unable to run the quota anyway with 2.4.19-pre6,as it would make the filer crash, i upgraded to
> 2.4.20rc1 + data-logging patches and mounted the fs with noatime,nodiratime and data=journal

Ok, I'll make a patch against that.

> 
> Filer is still heavily loaded but runs fine. Problem now is that we don't have quota working anymore
> and this is a _big_ issue as warez lovers do test our systems every single minute.
> 
> BTW, should i apply the patch posted on LKML ?
> Subject:2.[45] fixes for design locking bug in wait_on_page/wait_on_buffer/get_request_wait

No, wait for that fix to get into a kernel before using it.  There might
still be small modifications, and andrea might find a few other places
in the kernel with similar races.

The bug results in io stalls, and the machines that do see them stall
for between 10 minutes and an hour.  If you needed the patch you would
have already been complaining ;-)  The races do not cause corruptions or
crashes of any kind.

> 
> Ideally, i'd like to stay with 2.4.20rc1 thus having working quota.
> I couldn't check closely enough because of being too busy but i think many important patches have been
> set into RC1 (new block allocator, etc..).
> I tried to have a look to merge the quota code with RC1 but unfortunately my knowledge of quota/reiserfs
> prevent me from doing anything but cosmetic merge :o)
> 
> Let me know what your plans are.

Jan and I will get an updated quota patch asap, and I'll have my
attempted fix for the transaction overflow ready for you against that
set of patches in a few hours.

-chris




* Re: Oops with in nfsd - 2.4.19-pre6
  2002-11-13 20:23         ` Chris Mason
@ 2002-11-14  2:33           ` Chris Mason
  2002-11-27 20:54             ` Data-Logging Progess?! (was: Oops with in nfsd - 2.4.19-pre6) Manuel Krause
  2002-11-14 16:04           ` Oops with in nfsd - 2.4.19-pre6 Philippe Gramoullé
  1 sibling, 1 reply; 21+ messages in thread
From: Chris Mason @ 2002-11-14  2:33 UTC (permalink / raw)
  To: Chris Mason; +Cc: Philippe Gramoullé, reiserfs-list

On Wed, 2002-11-13 at 15:23, Chris Mason wrote:

> > Being unable to run the quota anyway with 2.4.19-pre6,as it would make the filer crash, i upgraded to
> > 2.4.20rc1 + data-logging patches and mounted the fs with noatime,nodiratime and data=journal
> 
> Ok, I'll make a patch against that.

Ok, I'm sending a beta of the patch, it has been through light testing. 
I'm going to run extended tests overnight (in data=journal mode for
Philippe), but I wanted to give Philippe the chance to use this on his
own non-critical machines as well.

The patch is fairly simple, but the code involved can be very subtle so
I'm trying to be as cautious as possible.

This is against 2.4.20-rc1 + the data logging code.  It should fix
possible transaction overflows in the journal by pre-reserving space in
the transaction during unbounded operations (truncate and hole
creation).

It also shows that I owe namesys 1 beer, because it makes hole creation
a whole lot faster (I've been blaming the balancing code for poor hole
creation speed).
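
Why the restart count matters can be shown with a toy calculation. This is only an illustration under assumptions: toy_restart_count and its parameters are hypothetical, and the model supposes the handle's reservation grows by a fixed amount between restarts during an unbounded operation.

```c
/* Reservation requested for the last restarted transaction after `rounds`
 * restarts of an unbounded operation.
 *   old_style != 0: reuse the handle's running total as the next request,
 *                   so each restart asks for more than the one before.
 *   old_style == 0: always ask for jbegin_count, captured once at the start. */
static int toy_restart_count(int jbegin_count, int growth_per_round,
                             int rounds, int old_style)
{
    int t_blocks_allocated = jbegin_count;
    int request = jbegin_count;
    int i;

    for (i = 0; i < rounds; i++) {
        /* the handle's reservation grows while the operation runs */
        t_blocks_allocated += growth_per_round;

        request = old_style ? t_blocks_allocated : jbegin_count;

        /* the new transaction begins with `request` blocks reserved */
        t_blocks_allocated = request;
    }
    return request;
}
```

Starting from 10 blocks and growing by 50 per round, five restarts in the old style end up requesting 260 blocks, while passing the fixed count keeps every request at 10. Large requests are what tend to force the running transaction to close early.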

-chris

--- 1.40/fs/reiserfs/inode.c	Wed Nov 13 17:47:56 2002
+++ edited/fs/reiserfs/inode.c	Wed Nov 13 19:20:39 2002
@@ -205,14 +205,15 @@
 }
 
 static void restart_transaction(struct reiserfs_transaction_handle *th,
-				struct inode *inode, struct path *path) {
+				struct inode *inode, struct path *path,
+				int jbegin_count) {
   /* we cannot restart while nested unless the parent allows it */
   if (!reiserfs_restartable_handle(th) && th->t_refcount > 1) {
       return  ;
   }
   pathrelse(path) ;
   reiserfs_update_sd(th, inode) ;
-  reiserfs_restart_transaction(th) ;
+  reiserfs_restart_transaction(th, jbegin_count) ;
   reiserfs_update_inode_transaction(inode) ;
 }
 
@@ -652,7 +653,7 @@
 	    ** some blocks.  releases the path, so we have to go back to
 	    ** research if we succeed on the second try
 	    */
-	    restart_transaction(th, inode, &path) ; 
+	    restart_transaction(th, inode, &path, jbegin_count) ; 
 	    repeat = _allocate_block(th, block, inode, &allocated_block_nr, NULL, create);
 
 	    if (repeat != NO_DISK_SPACE) {
@@ -882,8 +883,8 @@
 	** this only happens when inserting holes into the file, so it
 	** does not affect data=ordered safety at all
 	*/
-	if (journal_transaction_should_end(th, th->t_blocks_allocated)) {
-	    restart_transaction(th, inode, &path) ; 
+	if (journal_transaction_should_end(th, jbegin_count)) {
+	    restart_transaction(th, inode, &path, jbegin_count) ; 
 	}
 	/* inserting indirect pointers for a hole can take a 
 	** long time.  reschedule if needed
--- 1.25/fs/reiserfs/journal.c	Wed Nov 13 17:47:56 2002
+++ edited/fs/reiserfs/journal.c	Wed Nov 13 21:23:47 2002
@@ -2532,8 +2532,6 @@
 */
 int journal_transaction_should_end(struct reiserfs_transaction_handle *th, int new_alloc) {
   time_t now = CURRENT_TIME ;
-  if (reiserfs_dont_log(th->t_super)) 
-    return 0 ;
 
   /* cannot restart while nested unless the parent allows it */
   if (!reiserfs_restartable_handle(th) && th->t_refcount > 1)
@@ -2545,13 +2543,20 @@
        SB_JOURNAL(th->t_super)->j_cnode_free < (SB_JOURNAL_TRANS_MAX(th->t_super) * 3)) { 
     return 1 ;
   }
+
+  /* we are allowing them to continue in the current transaction, so
+  * we have to bump the blocks allocated now.
+  */
+  th->t_blocks_allocated += new_alloc;
+  SB_JOURNAL(th->t_super)->j_len_alloc += new_alloc;
+
   return 0 ;
 }
 
-int reiserfs_restart_transaction(struct reiserfs_transaction_handle *th) {
+int 
+reiserfs_restart_transaction(struct reiserfs_transaction_handle *th, int num) {
     int refcount = th->t_refcount ;
     struct super_block *s = th->t_super ;
-    int num = th->t_blocks_allocated ;
     int flags = th->t_flags ;
     int parent_flags = 0;
     struct reiserfs_transaction_handle *saved_th = current->journal_info ;
@@ -2568,7 +2573,7 @@
 	parent_flags = saved_th->t_flags ;
     }
     th->t_flags = 0 ;
-    journal_end(th, s, num) ;
+    journal_end(th, s, th->t_blocks_allocated) ;
     journal_begin(th, s, num) ;
     th->t_flags = flags; 
     if (refcount > 1) {
--- 1.21/fs/reiserfs/stree.c	Wed Nov 13 17:47:56 2002
+++ edited/fs/reiserfs/stree.c	Wed Nov 13 19:29:01 2002
@@ -1705,6 +1705,7 @@
 	n_new_file_size;/* New file size. */
     int                   n_deleted;      /* Number of deleted or truncated bytes. */
     int retval;
+    int jbegin_count = th->t_blocks_allocated;
 
     if ( ! (S_ISREG(p_s_inode->i_mode) || S_ISDIR(p_s_inode->i_mode) || S_ISLNK(p_s_inode->i_mode)) )
 	return;
@@ -1784,14 +1785,14 @@
 	** sure the file is consistent before ending the current trans
 	** and starting a new one
 	*/
-        if (journal_transaction_should_end(th, th->t_blocks_allocated)) {
+        if (journal_transaction_should_end(th, jbegin_count)) {
 	  decrement_counters_in_path(&s_search_path) ;
 
 	  if (update_timestamps) {
 	      p_s_inode->i_mtime = p_s_inode->i_ctime = CURRENT_TIME;
 	  } 
 	  reiserfs_update_sd(th, p_s_inode) ;
-	  reiserfs_restart_transaction(th) ;
+	  reiserfs_restart_transaction(th, jbegin_count) ;
 	  reiserfs_update_inode_transaction(p_s_inode) ;
 	}
     } while ( n_file_size > ROUND_UP (n_new_file_size) &&
--- 1.26/include/linux/reiserfs_fs.h	Wed Nov 13 17:47:56 2002
+++ edited/include/linux/reiserfs_fs.h	Wed Nov 13 19:29:32 2002
@@ -1806,7 +1806,7 @@
 int push_journal_writer(char *w) ;
 int pop_journal_writer(int windex) ;
 int journal_transaction_should_end(struct reiserfs_transaction_handle *, int) ;
-int reiserfs_restart_transaction(struct reiserfs_transaction_handle *) ;
+int reiserfs_restart_transaction(struct reiserfs_transaction_handle *, int) ;
 int reiserfs_in_journal(struct super_block *p_s_sb, kdev_t dev, int bmap_nr, int bit_nr, int size, int searchall, unsigned int *next) ;
 int journal_begin(struct reiserfs_transaction_handle *, struct super_block *p_s_sb, unsigned long) ;
 


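The accounting idea in the patch above can be modelled outside the kernel. The following is a user-space sketch only, not the reiserfs code: the `toy_*` names, the single-handle assumption, and the simple "reserved + new >= max" test are all hypothetical simplifications. It shows a long-running operation asking before each step whether the transaction has room, reserving the blocks if it does, and restarting with a fresh reservation if it does not.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-ins for the journal and the transaction handle (hypothetical). */
struct toy_journal {
    int len_alloc;   /* blocks reserved in the running transaction */
    int trans_max;   /* hard limit on blocks per transaction */
};

struct toy_handle {
    struct toy_journal *j;
    int blocks_allocated;   /* blocks this handle has reserved so far */
};

/* Start a transaction, reserving num blocks up front. */
static void toy_begin(struct toy_handle *th, struct toy_journal *j, int num)
{
    th->j = j;
    th->blocks_allocated = num;
    j->len_alloc += num;
    assert(j->len_alloc <= j->trans_max);
}

/* Return 1 if the caller must restart; otherwise reserve new_alloc and return 0. */
static int toy_should_end(struct toy_handle *th, int new_alloc)
{
    if (th->j->len_alloc + new_alloc >= th->j->trans_max)
        return 1;
    /* Caller continues in this transaction, so account for the blocks now. */
    th->blocks_allocated += new_alloc;
    th->j->len_alloc += new_alloc;
    return 0;
}

/* End with everything reserved so far, then begin again with num blocks. */
static void toy_restart(struct toy_handle *th, int num)
{
    struct toy_journal *j = th->j;

    j->len_alloc -= th->blocks_allocated;
    toy_begin(th, j, num);
}

/* Run an operation of `steps` iterations, each needing per_step blocks.
 * Returns the number of restarts; *peak receives the largest reservation seen. */
static int toy_unbounded_op(struct toy_journal *j, int steps, int per_step, int *peak)
{
    struct toy_handle th;
    int restarts = 0;

    toy_begin(&th, j, per_step);
    *peak = j->len_alloc;
    for (int i = 0; i < steps; i++) {
        if (toy_should_end(&th, per_step)) {
            toy_restart(&th, per_step);
            restarts++;
        }
        if (j->len_alloc > *peak)
            *peak = j->len_alloc;
    }
    j->len_alloc -= th.blocks_allocated;   /* final "journal_end" */
    return restarts;
}
```

In this model, calling `toy_unbounded_op` on a journal with `trans_max` of 100 and 10 blocks per step never lets the reservation reach the limit, however many steps are requested; the operation restarts instead.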

* Re: Oops with in nfsd - 2.4.19-pre6
  2002-11-13 20:23         ` Chris Mason
  2002-11-14  2:33           ` Chris Mason
@ 2002-11-14 16:04           ` Philippe Gramoullé
  2002-11-14 16:32             ` Chris Mason
  1 sibling, 1 reply; 21+ messages in thread
From: Philippe Gramoullé @ 2002-11-14 16:04 UTC (permalink / raw)
  To: Chris Mason; +Cc: reiserfs-list

On 13 Nov 2002 15:23:21 -0500
Chris Mason <mason@suse.com> wrote:

  |  > 
  |  > BTW, should i apply the patch posted on LKML ?
  |  > Subject:2.[45] fixes for design locking bug in wait_on_page/wait_on_buffer/get_request_wait
  |  
  |  No, wait for that fix to get into a kernel before using it.  There might
  |  still be small modifications, and andrea might find a few other places
  |  in the kernel with similar races.
  |  
  |  The bug results in io stalls, and the machines that do see them stall
  |  for between 10 minutes and an hour.  If you needed the patch you would
  |  have already been complaining ;-) 

Well, now that you mention this, i may also have it on my bug squash list :o)
When we feed some files to a MySQL server ( 4 way box, lots of threads), sometimes the
box does nothing , pretty much like andrea described it , so it may be that.

  |  The races do not cause corruptions or crashes of any kind.

Right no corruption, just mysql sitting there doing nothing.
Ok then i'll wait for the next version.
  |  

  |  > 
  |  > Let me know what your plans are.
  |  
  |  Jan and I will get an updated quota patch asap,

This is just great :o) Do you have a rough idea when it will be
available ? couple days ? couple weeks ?

  |  and I'll have my
  |  attempted fix for the transaction overflow ready for you against that
  |  set of patches in a few hours.

I already got your fix and put it on 4 or 5 production boxes that triggered the bug.

So far, it's been running fine :o) but without quotas ;o))

Thanks much

Philippe


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-11-14 16:04           ` Oops with in nfsd - 2.4.19-pre6 Philippe Gramoullé
@ 2002-11-14 16:32             ` Chris Mason
  2002-11-14 17:41               ` Philippe Gramoullé
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Mason @ 2002-11-14 16:32 UTC (permalink / raw)
  To: Philippe Gramoullé; +Cc: reiserfs-list, jack

On Thu, 2002-11-14 at 11:04, Philippe Gramoullé wrote:
> On 13 Nov 2002 15:23:21 -0500
> Chris Mason <mason@suse.com> wrote:
> 
>   |  > 
>   |  > BTW, should i apply the patch posted on LKML ?
>   |  > Subject:2.[45] fixes for design locking bug in wait_on_page/wait_on_buffer/get_request_wait
>   |  
>   |  No, wait for that fix to get into a kernel before using it.  There might
>   |  still be small modifications, and andrea might find a few other places
>   |  in the kernel with similar races.
>   |  
>   |  The bug results in io stalls, and the machines that do see them stall
>   |  for between 10 minutes and an hour.  If you needed the patch you would
>   |  have already been complaining ;-) 
> 
> Well, now that you mention this, i may also have it on my bug squash list :o)
> When we feed some files to a MySQL server ( 4 way box, lots of threads), sometimes the
> box does nothing , pretty much like andrea described it , so it may be that.

Ok, that does sound like the stalling bug, especially since you've got 4
cpus.  You can make it much less likely to trigger by lowering the
threshold for where bdflush jumps in, but andrea's final patch should
show up shortly.
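
For reference, 2.4 kernels expose the bdflush tunables as one line of integers in /proc/sys/vm/bdflush, and the first field (nfract) is the percentage of dirty buffers at which bdflush is woken. A minimal sketch in C, assuming that is the threshold meant here: the helper name `bdflush_set_nfract` and the sample values in the usage note are hypothetical, and the helper only builds the new line. Writing the result back to the proc file (as root) is left to the caller.

```c
#include <stdio.h>

/*
 * Build a new bdflush parameter line from the current one, replacing only
 * the first integer (nfract) and leaving every other field untouched.
 * Returns 0 on success, -1 on a bad percentage, unparsable input, or an
 * output buffer that is too small.
 */
static int bdflush_set_nfract(const char *current, int new_nfract,
                              char *out, size_t outlen)
{
    int old_nfract;
    int consumed = 0;
    int n;

    if (new_nfract < 0 || new_nfract > 100)
        return -1;

    /* %n records how many characters the first integer occupied. */
    if (sscanf(current, "%d%n", &old_nfract, &consumed) != 1)
        return -1;

    n = snprintf(out, outlen, "%d%s", new_nfract, current + consumed);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

Given a line such as "30 500 0 0 500 3000 60 20 0", asking for an nfract of 10 produces the same line with only the leading 30 changed to 10.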

>   |  
>   |  Jan and I will get an updated quota patch asap,
> 
> This is just great :o) Do you have a rough idea when it will be
> available ? couple days ? couple weeks ?

Now that I've looked at it, I think the only reject is that parisc-32
already has one hunk applied, and x86-64 needs to have quotav2 support
added.  Neither one affects i386, but I'll do some tests here.  Jan, am
I missing something?

> 
> 
>   |  and I'll have my
>   |  attempted fix for the transaction overflow ready for you against that
>   |  set of patches in a few hours.
> 
> I already got your fix and put it on 4 or 5 production boxes that triggered the bug.
> 
> So far, it's been running fine :o) but without quotas ;o))

Ok, it survived heavier load here overnight.

-chris



* Re: Oops with in nfsd - 2.4.19-pre6
  2002-11-14 16:32             ` Chris Mason
@ 2002-11-14 17:41               ` Philippe Gramoullé
  2002-11-14 17:46                 ` Chris Mason
  0 siblings, 1 reply; 21+ messages in thread
From: Philippe Gramoullé @ 2002-11-14 17:41 UTC (permalink / raw)
  To: Chris Mason; +Cc: reiserfs-list, jack

On 14 Nov 2002 11:32:51 -0500
Chris Mason <mason@suse.com> wrote:

  |  > Well, now that you mention this, i may also have it on my bug squash list :o)
  |  > When we feed some files to a MySQL server ( 4 way box, lots of threads), sometimes the
  |  > box does nothing , pretty much like andrea described it , so it may be that.
  |  
  |  Ok, that does sound like the stalling bug, especially since you've got 4
  |  cpus.

That's what i thought too as it doesn't show up as much on other smaller boxes.

  |   You can make it much less likely to trigger by lowering the
  |  threshold for where bdflush jumps in,

Ok.

  |  but andrea's final patch should show up shortly.

Excellent.

  |  
  |  >   |  
  |  >   |  Jan and I will get an updated quota patch asap,
  |  > 
  |  > This is just great :o) Do you have a rough idea when it will be
  |  > available ? couple days ? couple weeks ?
  |  
  |  Now that I've looked at it, I think the only reject is that parisc-32
  |  already has one hunk applied, and x86-64 needs to have quotav2 support
  |  added.  Neither one affects i386, but I'll do some tests here.  Jan, am
  |  I missing something?

There were indeed some rejects in parisc-32, but there were a lot more than that.
Rejects were when applying reiserfs-quota-22.diff.
(using vanilla 2.4.20rc1 + ftp.suse.com:/pub/people/mason/patches/reiserfs/quota-2.4/2.4.19 quota
patches)

  |  > 
  |  > So far, it's been running fine :o) but without quotas ;o))
  |  
  |  Ok, it survived heavier load here overnight.

Excellent, especially since I'm on call this week ;o)

  |  
  |  -chris
  |  

Thanks much for your help.

Philippe.


* Re: Oops with in nfsd - 2.4.19-pre6
  2002-11-14 17:41               ` Philippe Gramoullé
@ 2002-11-14 17:46                 ` Chris Mason
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Mason @ 2002-11-14 17:46 UTC (permalink / raw)
  To: Philippe Gramoullé; +Cc: reiserfs-list, jack

On Thu, 2002-11-14 at 12:41, Philippe Gramoullé wrote:

>   |  Now that I've looked at it, I think the only reject is that parisc-32
>   |  already has one hunk applied, and x86-64 needs to have quotav2 support
>   |  added.  Neither one affects i386, but I'll do some tests here.  Jan, am
>   |  I missing something?
> 
> There were indeed some rejects in parisc-32, but there were a lot more than that.
> Rejects were when applying reiserfs-quota-22.diff.
> (using vanilla 2.4.20rc1 + ftp.suse.com:/pub/people/mason/patches/reiserfs/quota-2.4/2.4.19 quota
> patches)

Those are conflicts between the reiserfs quota and the data logging. 
But, I already have data-logging quota patches, so I'm only really
worried about Jan's base quota patch.

-chris





* Data-Logging Progess?!  (was: Oops with in nfsd - 2.4.19-pre6)
  2002-11-14  2:33           ` Chris Mason
@ 2002-11-27 20:54             ` Manuel Krause
  2002-11-28 10:18               ` Philippe Gramoullé
  0 siblings, 1 reply; 21+ messages in thread
From: Manuel Krause @ 2002-11-27 20:54 UTC (permalink / raw)
  To: Chris Mason; +Cc: reiserfs-list

On 11/14/2002 03:33 AM, Chris Mason wrote:
> On Wed, 2002-11-13 at 15:23, Chris Mason wrote:
> 
>>>Being unable to run the quota anyway with 2.4.19-pre6,as it would make the filer crash, i upgraded to
>>>2.4.20rc1 + data-logging patches and mounted the fs with noatime,nodiratime and data=journal
>>
>>Ok, I'll make a patch against that.
> 
> Ok, I'm sending a beta of the patch, it has been through light testing. 
> I'm going to run extended tests overnight (in data=journal mode for
> Philippe), but I wanted to give Philippe the chance to use this on his
> own non-critical machines as well.
> 
> The patch is fairly simple, but the code involved can be very subtle, so
> I'm trying to be as cautious as possible.
> 
> This is against 2.4.20-rc1 + the data logging code.  It should fix
> possible transaction overflows in the journal by pre-reserving space in
> the transaction during unbounded operations (truncate and hole
> creation).
> 
> It also shows that I owe namesys 1 beer, because it makes hole creation
> a whole lot faster (I've been blaming the balancing code for poor hole
> creation speed).
> 
> -chris
> 
> [possible-transaction-overflow-fix.diff]

Hi!

Is there any news regarding the data-logging code? Issues that justify 
the "experimental" status (e.g. root-fs mounting only with 
rootflags=data=...) or are there new fixes?!

Originally I wondered why the patch you, Chris, sent to the list isn't 
in your ftp server's directory for kernel 2.4.20 yet.

My late questions regarding this patch:
Does this patch make sense for people _not_ using
a) nfs
b) quota
c) data=journal
at all ?!

I applied this patch to 2.4.20-rc[1,3] since you posted it and it works 
well for me. I only use the data=ordered mode and nothing of [a,b,c].


Thanks,

Manuel.



* Re: Data-Logging Progess?!  (was: Oops with in nfsd - 2.4.19-pre6)
  2002-11-27 20:54             ` Data-Logging Progess?! (was: Oops with in nfsd - 2.4.19-pre6) Manuel Krause
@ 2002-11-28 10:18               ` Philippe Gramoullé
  2002-11-28 19:23                 ` Manuel Krause
  0 siblings, 1 reply; 21+ messages in thread
From: Philippe Gramoullé @ 2002-11-28 10:18 UTC (permalink / raw)
  To: Manuel Krause; +Cc: reiserfs-list, mason

Hi Manuel,

We've been using the journal overflow patch sent by Chris since Nov 14th, on a busy
file server and up to this weekend it proved to solve our problems.

But this weekend we experienced 2 crashes on this server so there still might
be some problems to be fixed. I guess that this justifies the "experimental" status
of this patch.

In our case , we use :

a) NFS
b) quotas (as soon as a patch is ready for this kernel version)
c) data=journal

Thanks,

Philippe

On Wed, 27 Nov 2002 21:54:14 +0100
Manuel Krause <manuel.krause@mb.tu-ilmenau.de> wrote:

  |  Hi!
  |  
  |   Is there any news regarding the data-logging code? Issues that justify 
  |   the "experimental" status (e.g. root-fs mounting only with 
  |   rootflags=data=...) or are there new fixes?!
  |  
  |   Originally I wondered why the patch you, Chris, sent to the list isn't 
  |   in your ftp server's directory for kernel 2.4.20 yet.
  |  
  |   My late questions regarding this patch:
  |   Does this patch make sense for people _not_ using
  |   a) nfs
  |   b) quota
  |   c) data=journal
  |   at all ?!
  |  
  |   I applied this patch to 2.4.20-rc[1,3] since you posted it and it works 
  |   well for me. I only use the data=ordered mode and nothing of [a,b,c].
  |  
  |  
  |   Thanks,
  |  
  |   Manuel.


* Re: Data-Logging Progess?!  (was: Oops with in nfsd - 2.4.19-pre6)
  2002-11-28 10:18               ` Philippe Gramoullé
@ 2002-11-28 19:23                 ` Manuel Krause
  0 siblings, 0 replies; 21+ messages in thread
From: Manuel Krause @ 2002-11-28 19:23 UTC (permalink / raw)
  To: Philippe Gramoullé; +Cc: Chris Mason, reiserfs-list

On 11/28/2002 11:18 AM, Philippe Gramoullé wrote:
> Hi Manuel,
> 
> We've been using the journal overflow patch sent by Chris since Nov 14th, on a busy
> file server and up to this weekend it proved to solve our problems.
> 
> But this weekend we experienced 2 crashes on this server so there still might
> be some problems to be fixed. I guess that this justifies the "experimental" status
> of this patch.
> 
> In our case , we use :
> 
> a) NFS
> b) quotas (as soon as a patch is ready for this kernel version)
> c) data=journal
> 
> Thanks,
> 
> Philippe
> 

Hi Philippe,

Thank you for your reply.

Did you try other kernels than 2.4.20-rc1? Maybe -rc3?

I remember having had some random ugly stalls while deleting via 
KDE 2.2.2 Konqueror with -rc1 that seem to have disappeared with -rc3.

IIRC I once recreated my local kernel sources via a shell script (simple 
file copy from another disk), patched them wrongly and then stalled 
while deleting the mess via Konqueror. Everything was working except for 
this "Delete files" window. After rebooting the delete worked.

I don't know if exactly this was related to some possible race 
conditions Chris mentioned in one of his previous mails.

Thanks,

Manuel.


end of thread, other threads:[~2002-11-28 19:23 UTC | newest]

Thread overview: 21+ messages
2002-10-31 21:08 Oops with in nfsd - 2.4.19-pre6 JP Howard
2002-10-31 21:30 ` Chris Mason
2002-11-01 14:46   ` Christopher Barry
2002-11-12 17:29   ` Philippe Gramoullé
2002-11-13 18:14     ` Chris Mason
2002-11-13 18:40       ` Philippe Gramoullé
2002-11-13 20:23         ` Chris Mason
2002-11-14  2:33           ` Chris Mason
2002-11-27 20:54             ` Data-Logging Progess?! (was: Oops with in nfsd - 2.4.19-pre6) Manuel Krause
2002-11-28 10:18               ` Philippe Gramoullé
2002-11-28 19:23                 ` Manuel Krause
2002-11-14 16:04           ` Oops with in nfsd - 2.4.19-pre6 Philippe Gramoullé
2002-11-14 16:32             ` Chris Mason
2002-11-14 17:41               ` Philippe Gramoullé
2002-11-14 17:46                 ` Chris Mason
  -- strict thread matches above, loose matches on Subject: below --
2002-10-31 22:12 JP Howard
2002-10-29 14:59 Philippe Gramoullé
2002-10-29 15:14 ` Oleg Drokin
2002-10-29 15:20   ` Philippe Gramoullé
2002-10-29 15:26     ` Oleg Drokin
2002-10-31 20:38     ` Chris Mason
