ext3 corruption

public inbox for linux-kernel@vger.kernel.org

* ext3 corruption
@ 2002-07-12 15:32 Alec Smith
  2002-07-12 15:52 ` Russell King
  2002-07-12 16:02 ` Alan Cox
  0 siblings, 2 replies; 49+ messages in thread
From: Alec Smith @ 2002-07-12 15:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: ext3-users

Hello,

Over the last month or so, I've noticed the following error showing up
repeatedly in my system logs under kernel 2.4.18-ac3 and more recently
under 2.4.19-rc1:

EXT3-fs error (device ide0(3,3)) in ext3_new_inode: error 28

I've now been able to capture the following Oops before the system went
down entirely:

Assertion failure in do_get_write_access() at transaction.c:611:
"!(((jh2bh(jh))->b_state & (1UL << BH_Lock)) != 0)"
kernel BUG at transaction.c:611!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c015b12e>]    Not tainted
EFLAGS: 00010282
eax: 00000078   ebx: ddadd294   ecx: 00000004   edx: ddb0ff64
esi: ddadd200   edi: dec5d920   ebp: ddadd200   esp: d28dfe70
ds: 0018   es: 0018   ss: 0018
Process sendmail (pid: 21193, stackpage=d28df000)
Stack: c01f7460 c01f5969 c01f58d7 00000263 c01f94a0 00000000 00000000 cbf3b3c0
       ddadd294 ddadd200 dec5d920 d4a82730 c015b506 dec5d920 d4a82730 00000000
       ddd9acc0 ddadd000 dec5d920 c4cbdc20 c015744d dec5d920 ddd9acc0 00000000
Call Trace: [<c015b506>] [<c015744d>] [<c0155b04>] [<c0155b15>] [<c0155c07>]
   [<c0157a95>] [<c013b5c6>] [<c013b6a9>] [<c0111bc0>] [<c01087eb>]

Code: 0f 0b 63 02 d7 58 1f c0 83 c4 14 8b 4c 24 20 bb e2 ff ff ff


Any help or patches would be greatly appreciated. I'd be glad to provide
more information if needed.


Alec


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2002-07-12 15:32 Alec Smith
@ 2002-07-12 15:52 ` Russell King
  2002-07-12 19:51   ` Andrew Morton
  2002-07-12 16:02 ` Alan Cox
  1 sibling, 1 reply; 49+ messages in thread
From: Russell King @ 2002-07-12 15:52 UTC (permalink / raw)
  To: Alec Smith; +Cc: linux-kernel, ext3-users

On Fri, Jul 12, 2002 at 11:32:44AM -0400, Alec Smith wrote:
> Over the last month or so, I've noticed the following error showing up
> repeatedly in my system logs under kernel 2.4.18-ac3 and more recently
> under 2.4.19-rc1:
> 
> EXT3-fs error (device ide0(3,3)) in ext3_new_inode: error 28

Erm, that looks like the old "out of inodes, return -ENOSPC and mark the
filesystem read only" bug I found several months ago.  iirc, there have
been 3 recent issues (in the last three months) that I'm aware of:

1. running out of free blocks.
2. running out of free inodes.
3. i_nlink accounting goofup.

I've got patches from akpm for (1) and (3), but not (2).  It'd be nice to
have all three solved for 2.4.19.
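
For readers puzzled by the number in that log line: on Linux, errno 28 is
ENOSPC ("No space left on device"), which is why the message points at an
out-of-space or out-of-inodes condition. A minimal user-space sketch to
check this, assuming the log prints the positive errno value; the helper
names ext3_log_error_text and is_enospc are made up for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Map the positive "error N" value from an EXT3-fs log line to the
 * C library's description of that errno. */
static const char *ext3_log_error_text(int log_error)
{
    assert(log_error > 0);      /* assumption: the log prints -err, i.e. a positive value */
    return strerror(log_error);
}

/* Nonzero if the logged value is the out-of-space code. */
static int is_enospc(int log_error)
{
    return log_error == ENOSPC;
}
```

Calling is_enospc(28) on a Linux system should return nonzero, and
ext3_log_error_text(28) returns the libc's text for that code.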

-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


* Re: ext3 corruption
  2002-07-12 15:32 Alec Smith
  2002-07-12 15:52 ` Russell King
@ 2002-07-12 16:02 ` Alan Cox
  1 sibling, 0 replies; 49+ messages in thread
From: Alan Cox @ 2002-07-12 16:02 UTC (permalink / raw)
  To: Alec Smith; +Cc: linux-kernel, ext3-users

> Over the last month or so, I've noticed the following error showing up
> repeatedly in my system logs under kernel 2.4.18-ac3 and more recently
> under 2.4.19-rc1:

Force an fsck on the file system first.


* Re: ext3 corruption
       [not found] <Pine.LNX.4.33.0207121337500.8654-100000@coffee.psychology.mcmaster.ca>
@ 2002-07-12 19:05 ` Alec Smith
  2002-07-12 20:11   ` Andrew Morton
  0 siblings, 1 reply; 49+ messages in thread
From: Alec Smith @ 2002-07-12 19:05 UTC (permalink / raw)
  To: Mark Hahn; +Cc: ext3-users, linux-kernel

Here's the ksymoops output. Note that the kernel is entirely static --
I'm not using any modules. Ksymoops complains about the symbol
locations, but the defaults are in fact correct for this system.

[root@host155]~$ ksymoops < oops.txt
ksymoops 2.4.1 on i686 2.4.19-rc1.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.19-rc1/ (default)
     -m /boot/System.map-2.4.19-rc1 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

No modules in ksyms, skipping objects
Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
kernel BUG at transaction.c:611!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c015b12e>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010282
eax: 00000078   ebx: ddadd294   ecx: 00000004   edx: ddb0ff64
esi: ddadd200   edi: dec5d920   ebp: ddadd200   esp: d28dfe70
ds: 0018   es: 0018   ss: 0018
Process sendmail (pid: 21193, stackpage=d28df000)
Stack: c01f7460 c01f5969 c01f58d7 00000263 c01f94a0 00000000 00000000 cbf3b3c0
       ddadd294 ddadd200 dec5d920 d4a82730 c015b506 dec5d920 d4a82730 00000000
       ddd9acc0 ddadd000 dec5d920 c4cbdc20 c015744d dec5d920 ddd9acc0 00000000
Call Trace: [<c015b506>] [<c015744d>] [<c0155b04>] [<c0155b15>] [<c0155c07>]
   [<c0157a95>] [<c013b5c6>] [<c013b6a9>] [<c0111bc0>] [<c01087eb>]
Code: 0f 0b 63 02 d7 58 1f c0 83 c4 14 8b 4c 24 20 bb e2 ff ff ff

>>EIP; c015b12e <do_get_write_access+1ce/570>   <=====
Trace; c015b506 <journal_get_write_access+36/60>
Trace; c015744d <ext3_orphan_add+8d/1c0>
Trace; c0155b04 <ext3_mark_iloc_dirty+24/50>
Trace; c0155b15 <ext3_mark_iloc_dirty+35/50>
Trace; c0155c07 <ext3_mark_inode_dirty+27/40>
Trace; c0157a95 <ext3_unlink+135/190>
Trace; c013b5c6 <vfs_unlink+106/150>
Trace; c013b6a9 <sys_unlink+99/100>
Trace; c0111bc0 <do_page_fault+0/4cb>
Trace; c01087eb <system_call+33/38>
Code;  c015b12e <do_get_write_access+1ce/570>
00000000 <_EIP>:
Code;  c015b12e <do_get_write_access+1ce/570>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c015b130 <do_get_write_access+1d0/570>
   2:   63 02                     arpl   %ax,(%edx)
Code;  c015b132 <do_get_write_access+1d2/570>
   4:   d7                        xlat   %ds:(%ebx)
Code;  c015b133 <do_get_write_access+1d3/570>
   5:   58                        pop    %eax
Code;  c015b134 <do_get_write_access+1d4/570>
   6:   1f                        pop    %ds
Code;  c015b135 <do_get_write_access+1d5/570>
   7:   c0 83 c4 14 8b 4c 24      rolb   $0x24,0x4c8b14c4(%ebx)
Code;  c015b13c <do_get_write_access+1dc/570>
   e:   20 bb e2 ff ff ff         and    %bh,0xffffffe2(%ebx)
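
The disassembly above decodes the bytes after ud2a as instructions (arpl,
xlat, pop), but they are most likely data. Assuming the 2.4 i386 BUG()
macro emits ud2 followed by a 16-bit line number and a 32-bit pointer to
the file-name string (an assumption about how this kernel was built, not
something the oops itself states), the Code: bytes can be decoded roughly
as follows. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bug_info {
    uint16_t line;      /* source line stored after the ud2 opcode */
    uint32_t file_ptr;  /* kernel address of the file-name string */
};

/* Decode the little-endian operands that follow a ud2 (0f 0b) opcode.
 * Returns 0 on success, -1 if the bytes do not start with ud2 or are
 * too short to hold both fields. */
static int decode_bug_bytes(const uint8_t *code, size_t len,
                            struct bug_info *out)
{
    assert(code != NULL && out != NULL);

    if (len < 8 || code[0] != 0x0f || code[1] != 0x0b)
        return -1;

    out->line = (uint16_t)(code[2] | (code[3] << 8));
    out->file_ptr = (uint32_t)code[4] |
                    ((uint32_t)code[5] << 8) |
                    ((uint32_t)code[6] << 16) |
                    ((uint32_t)code[7] << 24);
    return 0;
}
```

Fed the first eight bytes of this oops (0f 0b 63 02 d7 58 1f c0), the line
field is 0x0263 = 611, which matches transaction.c:611, and the pointer is
c01f58d7, which also shows up in the stack dump right next to 00000263.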


On Fri, 12 Jul 2002, Mark Hahn wrote:

> > Over the last month or so, I've noticed the following error showing up
> > repeatedly in my system logs under kernel 2.4.18-ac3 and more recently
> > under 2.4.19-rc1:
>
> even though you have a nice assertion failure,
> you probably still need to follow the faq on decoding the oops.
>
>
> >
> > EXT3-fs error (device ide0(3,3)) in ext3_new_inode: error 28
> >
> > I've now been able to capture the following Oops before the system went
> > down entirely:
> >
> > Assertion failure in do_get_write_access() at transaction.c:611:
> > "!(((jh2bh(jh))->b_state & (1UL << BH_Lock)) != 0)"
> > kernel BUG at transaction.c:611!
> > invalid operand: 0000
> > CPU:    0
> > EIP:    0010:[<c015b12e>]    Not tainted
> > EFLAGS: 00010282
> > eax: 00000078   ebx: ddadd294   ecx: 00000004   edx: ddb0ff64
> > esi: ddadd200   edi: dec5d920   ebp: ddadd200   esp: d28dfe70
> > ds: 0018   es: 0018   ss: 0018
> > Process sendmail (pid: 21193, stackpage=d28df000)
> > Stack: c01f7460 c01f5969 c01f58d7 00000263 c01f94a0 00000000 00000000 cbf3b3c0
> >        ddadd294 ddadd200 dec5d920 d4a82730 c015b506 dec5d920 d4a82730 00000000
> >        ddd9acc0 ddadd000 dec5d920 c4cbdc20 c015744d dec5d920 ddd9acc0 00000000
> > Call Trace: [<c015b506>] [<c015744d>] [<c0155b04>] [<c0155b15>] [<c0155c07>]
> >    [<c0157a95>] [<c013b5c6>] [<c013b6a9>] [<c0111bc0>] [<c01087eb>]
> >
> > Code: 0f 0b 63 02 d7 58 1f c0 83 c4 14 8b 4c 24 20 bb e2 ff ff ff
> >
> >
> > Any help or patches would be greatly appreciated. I'd be glad to provide
> > more information if needed.
> >
> >
> > Alec
> >
>
> --
> operator may differ from spokesperson.	            hahn@mcmaster.ca
>                                               http://hahn.mcmaster.ca/~hahn
>



* Re: ext3 corruption
  2002-07-12 15:52 ` Russell King
@ 2002-07-12 19:51   ` Andrew Morton
  0 siblings, 0 replies; 49+ messages in thread
From: Andrew Morton @ 2002-07-12 19:51 UTC (permalink / raw)
  To: Russell King; +Cc: Alec Smith, linux-kernel, ext3-users, Marcelo Tosatti

Russell King wrote:
> 
> On Fri, Jul 12, 2002 at 11:32:44AM -0400, Alec Smith wrote:
> > Over the last month or so, I've noticed the following error showing up
> > repeatedly in my system logs under kernel 2.4.18-ac3 and more recently
> > under 2.4.19-rc1:
> >
> > EXT3-fs error (device ide0(3,3)) in ext3_new_inode: error 28
> 
> Erm, that looks like the old "out of inodes, return -ENOSPC and mark the
> filesystem read only" bug I found several months ago.  iirc, there have
> been 3 recent issues (in the last three months) that I'm aware of:
> 
> 1. running out of free blocks.
> 2. running out of free inodes.
> 3. i_nlink accounting goofup.
> 
> I've got patches from akpm for (1) and (3), but not (2).  It'd be nice to
> have all three solved for 2.4.19.

Whoa.  Thanks for the reminder.  Fixed in 2.5, fixed in ext3
CVS, forgotten in Linux.

Marcelo, please.  The patch makes ext3 return -ENOSPC when it
runs out of inodes rather than remounting the fs readonly
or forcing a panic.


--- 2.4.19-rc1/fs/ext3/ialloc.c~ext3-ialloc	Fri Jul 12 12:47:58 2002
+++ 2.4.19-rc1-akpm/fs/ext3/ialloc.c	Fri Jul 12 12:48:06 2002
@@ -392,7 +392,7 @@ repeat:
 
 	err = -ENOSPC;
 	if (!gdp)
-		goto fail;
+		goto out;
 
 	err = -EIO;
 	bitmap_nr = load_inode_bitmap (sb, i);
@@ -523,9 +523,10 @@ repeat:
 	return inode;
 
 fail:
+	ext3_std_error(sb, err);
+out:
 	unlock_super(sb);
 	iput(inode);
-	ext3_std_error(sb, err);
 	return ERR_PTR(err);
 }
 

-
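
The effect of the label reordering in the patch above can be shown in a
small user-space sketch. This is not the kernel code: report_fs_error() is
a hypothetical stand-in for ext3_std_error(), and alloc_inode_sketch() and
its parameters are invented for illustration. The point is only that the
"out of inodes" path jumps past the error report, while a real I/O failure
still goes through it.

```c
#include <assert.h>
#include <errno.h>

static int error_reports;   /* how many times the stand-in reporter ran */

/* Hypothetical stand-in for ext3_std_error(): only counts calls. */
static void report_fs_error(int err)
{
    assert(err < 0);
    error_reports++;
}

/* free_inodes == 0 models "no group has a free inode";
 * io_ok == 0 models a failed bitmap read. */
static int alloc_inode_sketch(int free_inodes, int io_ok)
{
    int err;

    err = -ENOSPC;
    if (free_inodes == 0)
        goto out;           /* expected condition: no error report */

    err = -EIO;
    if (!io_ok)
        goto fail;          /* genuine failure: report it */

    return 0;

fail:
    report_fs_error(err);
out:
    /* shared cleanup (unlock, release the inode) would go here */
    return err;
}
```

With this ordering, running out of inodes returns -ENOSPC to the caller
without calling the reporter, which in ext3 is what triggers the
remount-read-only or panic behaviour described above.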


* Re: ext3 corruption
  2002-07-12 19:05 ` Alec Smith
@ 2002-07-12 20:11   ` Andrew Morton
  0 siblings, 0 replies; 49+ messages in thread
From: Andrew Morton @ 2002-07-12 20:11 UTC (permalink / raw)
  To: Alec Smith; +Cc: Mark Hahn, ext3-users, linux-kernel

Alec Smith wrote:
> 
> ...
> kernel BUG at transaction.c:611!

Are you using data=journal there?

This is a sync with the current ext3 dev tree.  Should fix
it up.
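
The transaction.c hunk of this patch drops the "buffer must be unlocked"
assertion and instead takes the buffer lock with test_and_set_bit(),
waiting and retrying if someone else holds it. A rough user-space sketch of
that test-and-set-then-retry idea, using C11 atomics; LOCK_BIT is a
hypothetical bit position standing in for BH_Lock, and the retry loop
simply tries again where the kernel would sleep in __wait_on_buffer():

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit position standing in for BH_Lock. */
#define LOCK_BIT (1UL << 2)

/* Atomically set the lock bit; return nonzero if it was already set. */
static int test_and_set_lock(atomic_ulong *state)
{
    unsigned long old = atomic_fetch_or(state, LOCK_BIT);
    return (old & LOCK_BIT) != 0;
}

/* Clear the lock bit; the caller must currently hold it. */
static void release_lock(atomic_ulong *state)
{
    assert(atomic_load(state) & LOCK_BIT);
    atomic_fetch_and(state, ~LOCK_BIT);
}

/* Retry until the bit is acquired or max_tries is used up.
 * Returns 1 if the lock was taken, 0 otherwise. */
static int acquire_lock(atomic_ulong *state, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (!test_and_set_lock(state))
            return 1;
        /* The patched kernel code drops the journal lock, waits for
         * the buffer, retakes the journal lock and jumps to repeat. */
    }
    return 0;
}
```

Because the bit is tested and set in one atomic step, there is no window
in which another writer can lock the buffer between the check and the
use, which is the window the old assertion tripped over.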


--- ext3-linus_tree/fs/buffer.c	Wed Jun  5 00:18:36 2002
+++ ext3-1_0-branch/fs/buffer.c	Mon May 13 00:01:14 2002
@@ -1746,9 +1746,14 @@
 	}
 
 	/* Stage 3: start the IO */
-	for (i = 0; i < nr; i++)
-		submit_bh(READ, arr[i]);
-
+	for (i = 0; i < nr; i++) {
+		struct buffer_head * bh = arr[i];
+		if (buffer_uptodate(bh))
+			end_buffer_io_async(bh, 1);
+		else
+			submit_bh(READ, bh);
+	}
+	
 	return 0;
 }
 
--- ext3-linus_tree/fs/ext3/file.c	Wed Jun  5 00:18:36 2002
+++ ext3-1_0-branch/fs/ext3/file.c	Wed Jun  5 00:17:25 2002
@@ -61,19 +61,52 @@
 static ssize_t
 ext3_file_write(struct file *file, const char *buf, size_t count, loff_t *ppos)
 {
+	int ret, err;
 	struct inode *inode = file->f_dentry->d_inode;
 
-	/*
-	 * Nasty: if the file is subject to synchronous writes then we need
-	 * to force generic_osync_inode() to call ext3_write_inode().
-	 * We do that by marking the inode dirty.  This adds much more
-	 * computational expense than we need, but we're going to sync
-	 * anyway.
-	 */
-	if (IS_SYNC(inode) || (file->f_flags & O_SYNC))
-		mark_inode_dirty(inode);
+	ret = generic_file_write(file, buf, count, ppos);
 
-	return generic_file_write(file, buf, count, ppos);
+	/* Skip file flushing code if there was an error, or if nothing
+	   was written. */
+	if (ret <= 0)
+		return ret;
+	
+	/* If the inode is IS_SYNC, or is O_SYNC and we are doing
+           data-journaling, then we need to make sure that we force the
+           transaction to disk to keep all metadata uptodate
+           synchronously. */
+
+	if (file->f_flags & O_SYNC) {
+		/* If we are non-data-journaled, then the dirty data has
+                   already been flushed to backing store by
+                   generic_osync_inode, and the inode has been flushed
+                   too if there have been any modifications other than
+                   mere timestamp updates.
+		   
+		   Open question --- do we care about flushing
+		   timestamps too if the inode is IS_SYNC? */
+		if (!ext3_should_journal_data(inode))
+			return ret;
+
+		goto force_commit;
+	}
+
+	/* So we know that there has been no forced data flush.  If the
+           inode is marked IS_SYNC, we need to force one ourselves. */
+	if (!IS_SYNC(inode))
+		return ret;
+	
+	/* Open question #2 --- should we force data to disk here too?
+           If we don't, the only impact is that data=writeback
+           filesystems won't flush data to disk automatically on
+           IS_SYNC, only metadata (but historically, that is what ext2
+           has done.) */
+	
+force_commit:
+	err = ext3_force_commit(inode->i_sb);
+	if (err) 
+		return err;
+	return ret;
 }
 
 struct file_operations ext3_file_operations = {
--- ext3-linus_tree/fs/ext3/fsync.c	Wed Jun  5 00:18:36 2002
+++ ext3-1_0-branch/fs/ext3/fsync.c	Mon May 13 00:01:14 2002
@@ -62,7 +62,12 @@
 	 * we'll end up waiting on them in commit.
 	 */
 	ret = fsync_inode_buffers(inode);
-	ret |= fsync_inode_data_buffers(inode);
+
+	/* In writeback mode, we need to force out data buffers too.  In
+	 * the other modes, ext3_force_commit takes care of forcing out
+	 * just the right data blocks. */
+	if (test_opt(inode->i_sb, DATA_FLAGS) == EXT3_MOUNT_WRITEBACK_DATA)
+		ret |= fsync_inode_data_buffers(inode);
 
 	ext3_force_commit(inode->i_sb);
 
--- ext3-linus_tree/fs/ext3/ialloc.c	Wed Jun  5 00:18:36 2002
+++ ext3-1_0-branch/fs/ext3/ialloc.c	Mon May 13 00:01:14 2002
@@ -392,7 +392,7 @@
 
 	err = -ENOSPC;
 	if (!gdp)
-		goto fail;
+		goto out;
 
 	err = -EIO;
 	bitmap_nr = load_inode_bitmap (sb, i);
@@ -523,9 +523,10 @@
 	return inode;
 
 fail:
+	ext3_std_error(sb, err);
+out:
 	unlock_super(sb);
 	iput(inode);
-	ext3_std_error(sb, err);
 	return ERR_PTR(err);
 }
 
--- ext3-linus_tree/fs/ext3/inode.c	Wed Jun  5 00:18:37 2002
+++ ext3-1_0-branch/fs/ext3/inode.c	Wed Jun  5 00:17:25 2002
@@ -412,6 +412,7 @@
 	return NULL;
 
 changed:
+	brelse(bh);
 	*err = -EAGAIN;
 	goto no_block;
 failure:
@@ -948,11 +949,13 @@
 }
 
 static int walk_page_buffers(	handle_t *handle,
+				struct inode *inode,
 				struct buffer_head *head,
 				unsigned from,
 				unsigned to,
 				int *partial,
 				int (*fn)(	handle_t *handle,
+						struct inode *inode,
 						struct buffer_head *bh))
 {
 	struct buffer_head *bh;
@@ -970,7 +973,7 @@
 				*partial = 1;
 			continue;
 		}
-		err = (*fn)(handle, bh);
+		err = (*fn)(handle, inode, bh);
 		if (!ret)
 			ret = err;
 	}
@@ -1003,7 +1006,7 @@
  * write.  
  */
 
-static int do_journal_get_write_access(handle_t *handle, 
+static int do_journal_get_write_access(handle_t *handle, struct inode *inode,
 				       struct buffer_head *bh)
 {
 	return ext3_journal_get_write_access(handle, bh);
@@ -1029,7 +1032,7 @@
 		goto prepare_write_failed;
 
 	if (ext3_should_journal_data(inode)) {
-		ret = walk_page_buffers(handle, page->buffers,
+		ret = walk_page_buffers(handle, inode, page->buffers,
 				from, to, NULL, do_journal_get_write_access);
 		if (ret) {
 			/*
@@ -1050,24 +1053,32 @@
 	return ret;
 }
 
-static int journal_dirty_sync_data(handle_t *handle, struct buffer_head *bh)
+static int journal_dirty_sync_data(handle_t *handle, struct inode *inode,
+				   struct buffer_head *bh)
 {
-	return ext3_journal_dirty_data(handle, bh, 0);
+	int ret = ext3_journal_dirty_data(handle, bh, 0);
+	if (bh->b_inode != inode)
+		buffer_insert_inode_data_queue(bh, inode);
+	return ret;
 }
 
 /*
  * For ext3_writepage().  We also brelse() the buffer to account for
  * the bget() which ext3_writepage() performs.
  */
-static int journal_dirty_async_data(handle_t *handle, struct buffer_head *bh)
+static int journal_dirty_async_data(handle_t *handle, struct inode *inode, 
+				    struct buffer_head *bh)
 {
 	int ret = ext3_journal_dirty_data(handle, bh, 1);
+	if (bh->b_inode != inode)
+		buffer_insert_inode_data_queue(bh, inode);
 	__brelse(bh);
 	return ret;
 }
 
 /* For commit_write() in data=journal mode */
-static int commit_write_fn(handle_t *handle, struct buffer_head *bh)
+static int commit_write_fn(handle_t *handle, struct inode *inode, 
+			   struct buffer_head *bh)
 {
 	set_bit(BH_Uptodate, &bh->b_state);
 	return ext3_journal_dirty_metadata(handle, bh);
@@ -1102,7 +1113,7 @@
 		int partial = 0;
 		loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
 
-		ret = walk_page_buffers(handle, page->buffers,
+		ret = walk_page_buffers(handle, inode, page->buffers,
 			from, to, &partial, commit_write_fn);
 		if (!partial)
 			SetPageUptodate(page);
@@ -1112,7 +1123,7 @@
 		EXT3_I(inode)->i_state |= EXT3_STATE_JDATA;
 	} else {
 		if (ext3_should_order_data(inode)) {
-			ret = walk_page_buffers(handle, page->buffers,
+			ret = walk_page_buffers(handle, inode, page->buffers,
 				from, to, NULL, journal_dirty_sync_data);
 		}
 		/* Be careful here if generic_commit_write becomes a
@@ -1194,7 +1205,8 @@
 	return generic_block_bmap(mapping,block,ext3_get_block);
 }
 
-static int bget_one(handle_t *handle, struct buffer_head *bh)
+static int bget_one(handle_t *handle, struct inode *inode, 
+		    struct buffer_head *bh)
 {
 	atomic_inc(&bh->b_count);
 	return 0;
@@ -1293,7 +1305,7 @@
 			create_empty_buffers(page,
 				inode->i_dev, inode->i_sb->s_blocksize);
 		page_buffers = page->buffers;
-		walk_page_buffers(handle, page_buffers, 0,
+		walk_page_buffers(handle, inode, page_buffers, 0,
 				PAGE_CACHE_SIZE, NULL, bget_one);
 	}
 
@@ -1311,7 +1323,7 @@
 
 	/* And attach them to the current transaction */
 	if (order_data) {
-		err = walk_page_buffers(handle, page_buffers,
+		err = walk_page_buffers(handle, inode, page_buffers,
 			0, PAGE_CACHE_SIZE, NULL, journal_dirty_async_data);
 		if (!ret)
 			ret = err;
--- ext3-linus_tree/fs/ext3/namei.c	Wed Jun  5 00:18:37 2002
+++ ext3-1_0-branch/fs/ext3/namei.c	Wed Jun  5 00:17:25 2002
@@ -354,8 +354,8 @@
 			 */
 			dir->i_mtime = dir->i_ctime = CURRENT_TIME;
 			dir->u.ext3_i.i_flags &= ~EXT3_INDEX_FL;
-			ext3_mark_inode_dirty(handle, dir);
 			dir->i_version = ++event;
+			ext3_mark_inode_dirty(handle, dir);
 			BUFFER_TRACE(bh, "call ext3_journal_dirty_metadata");
 			ext3_journal_dirty_metadata(handle, bh);
 			brelse(bh);
@@ -464,8 +464,8 @@
 		inode->i_op = &ext3_file_inode_operations;
 		inode->i_fop = &ext3_file_operations;
 		inode->i_mapping->a_ops = &ext3_aops;
-		ext3_mark_inode_dirty(handle, inode);
 		err = ext3_add_nondir(handle, dentry, inode);
+		ext3_mark_inode_dirty(handle, inode);
 	}
 	ext3_journal_stop(handle, dir);
 	return err;
@@ -489,8 +489,8 @@
 	err = PTR_ERR(inode);
 	if (!IS_ERR(inode)) {
 		init_special_inode(inode, mode, rdev);
-		ext3_mark_inode_dirty(handle, inode);
 		err = ext3_add_nondir(handle, dentry, inode);
+		ext3_mark_inode_dirty(handle, inode);
 	}
 	ext3_journal_stop(handle, dir);
 	return err;
@@ -933,8 +933,8 @@
 		inode->i_size = l-1;
 	}
 	inode->u.ext3_i.i_disksize = inode->i_size;
-	ext3_mark_inode_dirty(handle, inode);
 	err = ext3_add_nondir(handle, dentry, inode);
+	ext3_mark_inode_dirty(handle, inode);
 out_stop:
 	ext3_journal_stop(handle, dir);
 	return err;
@@ -970,8 +970,8 @@
 	ext3_inc_count(handle, inode);
 	atomic_inc(&inode->i_count);
 
-	ext3_mark_inode_dirty(handle, inode);
 	err = ext3_add_nondir(handle, dentry, inode);
+	ext3_mark_inode_dirty(handle, inode);
 	ext3_journal_stop(handle, dir);
 	return err;
 }
--- ext3-linus_tree/fs/ext3/super.c	Wed Jun  5 00:18:37 2002
+++ ext3-1_0-branch/fs/ext3/super.c	Mon May 13 00:01:15 2002
@@ -1589,8 +1589,10 @@
 		journal_t *journal = EXT3_SB(sb)->s_journal;
 
 		/* Now we set up the journal barrier. */
+		unlock_super(sb);
 		journal_lock_updates(journal);
 		journal_flush(journal);
+		lock_super(sb);
 
 		/* Journal blocked and flushed, clear needs_recovery flag. */
 		EXT3_CLEAR_INCOMPAT_FEATURE(sb, EXT3_FEATURE_INCOMPAT_RECOVER);
--- ext3-linus_tree/fs/jbd/journal.c	Wed Jun  5 00:18:37 2002
+++ ext3-1_0-branch/fs/jbd/journal.c	Mon May 13 00:01:15 2002
@@ -1488,6 +1488,49 @@
 	unlock_journal(journal);
 }
 
+
+/*
+ * Report any unexpected dirty buffers which turn up.  Normally those
+ * indicate an error, but they can occur if the user is running (say)
+ * tune2fs to modify the live filesystem, so we need the option of
+ * continuing as gracefully as possible.  #
+ *
+ * The caller should already hold the journal lock and
+ * journal_datalist_lock spinlock: most callers will need those anyway
+ * in order to probe the buffer's journaling state safely.
+ */
+void __jbd_unexpected_dirty_buffer(char *function, int line, 
+				 struct journal_head *jh)
+{
+	struct buffer_head *bh = jh2bh(jh);
+	int jlist;
+	
+	if (buffer_dirty(bh)) {
+		printk ("%sUnexpected dirty buffer encountered at "
+			"%s:%d (%s blocknr %lu)\n",
+			KERN_WARNING, function, line,
+			kdevname(bh->b_dev), bh->b_blocknr);
+#ifdef JBD_PARANOID_WRITES
+		J_ASSERT (!buffer_dirty(bh));
+#endif	
+		
+		/* If this buffer is one which might reasonably be dirty
+		 * --- ie. data, or not part of this journal --- then
+		 * we're OK to leave it alone, but otherwise we need to
+		 * move the dirty bit to the journal's own internal
+		 * JBDDirty bit. */
+		jlist = jh->b_jlist;
+		
+		if (jlist == BJ_Metadata || jlist == BJ_Reserved || 
+		    jlist == BJ_Shadow || jlist == BJ_Forget) {
+			if (atomic_set_buffer_clean(jh2bh(jh))) {
+				set_bit(BH_JBDDirty, &jh2bh(jh)->b_state);
+			}
+		}
+	}
+}
+
+
 int journal_blocks_per_page(struct inode *inode)
 {
 	return 1 << (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits);
--- ext3-linus_tree/fs/jbd/transaction.c	Wed Jun  5 00:18:37 2002
+++ ext3-1_0-branch/fs/jbd/transaction.c	Wed Jun  5 00:17:26 2002
@@ -539,76 +539,67 @@
 static int
 do_get_write_access(handle_t *handle, struct journal_head *jh, int force_copy) 
 {
+	struct buffer_head *bh;
 	transaction_t *transaction = handle->h_transaction;
 	journal_t *journal = transaction->t_journal;
 	int error;
 	char *frozen_buffer = NULL;
 	int need_copy = 0;
-
+	int locked;
+	
 	jbd_debug(5, "buffer_head %p, force_copy %d\n", jh, force_copy);
 
 	JBUFFER_TRACE(jh, "entry");
 repeat:
+	bh = jh2bh(jh);
+
 	/* @@@ Need to check for errors here at some point. */
 
 	/*
-	 * AKPM: neither bdflush nor kupdate run with the BKL.   There's
-	 * nothing we can do to prevent them from starting writeout of a
-	 * BUF_DIRTY buffer at any time.  And checkpointing buffers are on
-	 * BUF_DIRTY.  So.  We no longer assert that the buffer is unlocked.
-	 *
-	 * However.  It is very wrong for us to allow ext3 to start directly
-	 * altering the ->b_data of buffers which may at that very time be
-	 * undergoing writeout to the client filesystem.  This can leave
-	 * the filesystem in an inconsistent, transient state if we crash.
-	 * So what we do is to steal the buffer if it is in checkpoint
-	 * mode and dirty.  The journal lock will keep out checkpoint-mode
-	 * state transitions within journal_remove_checkpoint() and the buffer
-	 * is locked to keep bdflush/kupdate/whoever away from it as well.
-	 *
 	 * AKPM: we have replaced all the lock_journal_bh_wait() stuff with a
 	 * simple lock_journal().  This code here will care for locked buffers.
 	 */
-	/*
-	 * The buffer_locked() || buffer_dirty() tests here are simply an
-	 * optimisation tweak.  If anyone else in the system decides to
-	 * lock this buffer later on, we'll blow up.  There doesn't seem
-	 * to be a good reason why they should do this.
-	 */
-	if (jh->b_cp_transaction &&
-	    (buffer_locked(jh2bh(jh)) || buffer_dirty(jh2bh(jh)))) {
+	locked = test_and_set_bit(BH_Lock, &bh->b_state);
+	if (locked) {
+		/* We can't reliably test the buffer state if we found
+		 * it already locked, so just wait for the lock and
+		 * retry. */
 		unlock_journal(journal);
-		lock_buffer(jh2bh(jh));
-		spin_lock(&journal_datalist_lock);
-		if (jh->b_cp_transaction && buffer_dirty(jh2bh(jh))) {
-			/* OK, we need to steal it */
-			JBUFFER_TRACE(jh, "stealing from checkpoint mode");
-			J_ASSERT_JH(jh, jh->b_next_transaction == NULL);
-			J_ASSERT_JH(jh, jh->b_frozen_data == NULL);
-
-			J_ASSERT(handle->h_buffer_credits > 0);
-			handle->h_buffer_credits--;
-
-			/* This will clear BH_Dirty and set BH_JBDDirty. */
-			JBUFFER_TRACE(jh, "file as BJ_Reserved");
-			__journal_file_buffer(jh, transaction, BJ_Reserved);
-
-			/* And pull it off BUF_DIRTY, onto BUF_CLEAN */
-			refile_buffer(jh2bh(jh));
+		__wait_on_buffer(bh);
+		lock_journal(journal);
+		goto repeat;
+	}
+	
+	/* We now hold the buffer lock so it is safe to query the buffer
+	 * state.  Is the buffer dirty? 
+	 * 
+	 * If so, there are two possibilities.  The buffer may be
+	 * non-journaled, and undergoing a quite legitimate writeback.
+	 * Otherwise, it is journaled, and we don't expect dirty buffers
+	 * in that state (the buffers should be marked JBD_Dirty
+	 * instead.)  So either the IO is being done under our own
+	 * control and this is a bug, or it's a third party IO such as
+	 * dump(8) (which may leave the buffer scheduled for read ---
+	 * ie. locked but not dirty) or tune2fs (which may actually have
+	 * the buffer dirtied, ugh.)  */
 
-			/*
-			 * The buffer is now hidden from bdflush.   It is
-			 * metadata against the current transaction.
-			 */
-			JBUFFER_TRACE(jh, "steal from cp mode is complete");
+	if (buffer_dirty(bh)) {
+		spin_lock(&journal_datalist_lock);
+		/* First question: is this buffer already part of the
+		 * current transaction or the existing committing
+		 * transaction? */
+		if (jh->b_transaction) {
+			J_ASSERT_JH(jh, jh->b_transaction == transaction || 
+				    jh->b_transaction == journal->j_committing_transaction);
+			if (jh->b_next_transaction)
+				J_ASSERT_JH(jh, jh->b_next_transaction == transaction);
+			JBUFFER_TRACE(jh, "Unexpected dirty buffer");
+			jbd_unexpected_dirty_buffer(jh);
 		}
 		spin_unlock(&journal_datalist_lock);
-		unlock_buffer(jh2bh(jh));
-		lock_journal(journal);
-		goto repeat;
 	}
 
-	J_ASSERT_JH(jh, !buffer_locked(jh2bh(jh)));
+	unlock_buffer(bh);
 
 	error = -EROFS;
 	if (is_handle_aborted(handle)) 
@@ -1912,8 +1903,29 @@
 	unlock_journal(journal);
 
 	if (!offset) {
-		if (!may_free || !try_to_free_buffers(page, 0))
+		if (!may_free || !try_to_free_buffers(page, 0)) {
+			if (!offset) {
+				/* We are still using the page, but only
+                                   because a transaction is pinning the
+                                   page.  Once it commits, we want to
+                                   encourage the page to be reaped as
+                                   quickly as possible. */
+				ClearPageReferenced(page);
+
+#if 0
+				/* Ugh, this is not exactly portable
+				   between VMs: we need a modular
+				   solution for this some day.. */
+				if (PageActive(page)) {
+					spin_lock(&pagemap_lru_lock);
+					del_page_from_active_list(page);
+					add_page_to_inactive_list(page);
+					spin_unlock(&pagemap_lru_lock);
+				}
+#endif
+			}
 			return 0;
+		}
 		J_ASSERT(page->buffers == NULL);
 	}
 	return 1;
@@ -1926,6 +1938,7 @@
 			transaction_t *transaction, int jlist)
 {
 	struct journal_head **list = 0;
+	int was_dirty = 0;
 
 	assert_spin_locked(&journal_datalist_lock);
 	
@@ -1936,13 +1949,24 @@
 	J_ASSERT_JH(jh, jh->b_transaction == transaction ||
 				jh->b_transaction == 0);
 
-	if (jh->b_transaction) {
-		if (jh->b_jlist == jlist)
-			return;
+	if (jh->b_transaction && jh->b_jlist == jlist)
+		return;
+	
+	/* The following list of buffer states needs to be consistent
+	 * with __jbd_unexpected_dirty_buffer()'s handling of dirty
+	 * state. */
+
+	if (jlist == BJ_Metadata || jlist == BJ_Reserved || 
+	    jlist == BJ_Shadow || jlist == BJ_Forget) {
+		if (atomic_set_buffer_clean(jh2bh(jh)) ||
+		    test_and_clear_bit(BH_JBDDirty, &jh2bh(jh)->b_state))
+			was_dirty = 1;
+	}
+
+	if (jh->b_transaction)
 		__journal_unfile_buffer(jh);
-	} else {
+	else
 		jh->b_transaction = transaction;
-	}
 
 	switch (jlist) {
 	case BJ_None:
@@ -1979,12 +2003,8 @@
 	__blist_add_buffer(list, jh);
 	jh->b_jlist = jlist;
 
-	if (jlist == BJ_Metadata || jlist == BJ_Reserved || 
-	    jlist == BJ_Shadow || jlist == BJ_Forget) {
-		if (atomic_set_buffer_clean(jh2bh(jh))) {
-			set_bit(BH_JBDDirty, &jh2bh(jh)->b_state);
-		}
-	}
+	if (was_dirty)
+		set_bit(BH_JBDDirty, &jh2bh(jh)->b_state);
 }
 
 void journal_file_buffer(struct journal_head *jh,
--- ext3-linus_tree/include/linux/ext3_fs.h	Wed Jun  5 00:18:37 2002
+++ ext3-1_0-branch/include/linux/ext3_fs.h	Wed Jun  5 00:17:27 2002
@@ -36,8 +36,8 @@
 /*
  * The second extended file system version
  */
-#define EXT3FS_DATE		"10 Jan 2002"
-#define EXT3FS_VERSION		"2.4-0.9.17"
+#define EXT3FS_DATE		"14 May 2002"
+#define EXT3FS_VERSION		"2.4-0.9.18"
 
 /*
  * Debug code
--- ext3-linus_tree/include/linux/jbd.h	Wed Jun  5 00:18:37 2002
+++ ext3-1_0-branch/include/linux/jbd.h	Mon May 13 00:01:15 2002
@@ -32,6 +32,14 @@
 
 #define journal_oom_retry 1
 
+/*
+ * Define JBD_PARANOID_WRITES to cause a kernel BUG() check if ext3
+ * finds a buffer unexpectedly dirty.  This is useful for debugging, but
+ * can cause spurious kernel panics if there are applications such as
+ * tune2fs modifying our buffer_heads behind our backs.
+ */
+#undef JBD_PARANOID_WRITES
+
 #ifdef CONFIG_JBD_DEBUG
 /*
  * Define JBD_EXPENSIVE_CHECKING to enable more expensive internal
@@ -730,6 +738,10 @@
 	schedule();						      \
 } while (1)
 
+extern void __jbd_unexpected_dirty_buffer(char *, int, struct journal_head *);
+#define jbd_unexpected_dirty_buffer(jh) \
+	__jbd_unexpected_dirty_buffer(__FUNCTION__, __LINE__, (jh))
+	
 /*
  * is_journal_abort
  *


* ext3 corruption
@ 2006-07-13 20:32 Molle Bestefich
  2006-08-08 23:47 ` Molle Bestefich
  0 siblings, 1 reply; 49+ messages in thread
From: Molle Bestefich @ 2006-07-13 20:32 UTC (permalink / raw)
  To: linux-kernel

Hello

I'm not quite sure where the right place is to go with this, so now
I'm asking here.  Hope you can help.

I have a ~1TB filesystem that failed to mount today, the message is:

EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
group 2338 not in group (block 1607003381)!
EXT3-fs: group descriptors corrupted !

Yesterday it worked flawlessly.

What's the problem, and what's the best course of action?

(cc appreciated, as I'm not subscribed!)
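
As a rough illustration of why that block number looks wrong: the message
says the block bitmap for group 2338 lies outside the group. The sketch
below computes the block range a group covers, assuming (none of this is
stated in the report) 4 KiB blocks, 32768 blocks per group and a first
data block of 0. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct group_range {
    uint64_t first;   /* first block belonging to the group */
    uint64_t last;    /* last block belonging to the group */
};

/* Compute the inclusive block range covered by one block group. */
static struct group_range group_block_range(uint32_t group,
                                            uint32_t first_data_block,
                                            uint32_t blocks_per_group)
{
    struct group_range r;

    assert(blocks_per_group > 0);
    r.first = (uint64_t)first_data_block +
              (uint64_t)group * blocks_per_group;
    r.last = r.first + blocks_per_group - 1;
    return r;
}

/* Nonzero if the block number falls inside the group's range. */
static int block_in_group(uint64_t block, struct group_range r)
{
    return block >= r.first && block <= r.last;
}
```

Under those assumptions group 2338 covers blocks 76611584 through
76644351, so 1607003381 is far outside it. It is also beyond the roughly
268 million 4 KiB blocks a 1 TiB filesystem would have, which suggests the
descriptor holds garbage rather than a slightly misplaced bitmap.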


* ext3 corruption
  2006-07-13 20:32 ext3 corruption Molle Bestefich
@ 2006-08-08 23:47 ` Molle Bestefich
  2006-08-09  1:33   ` Sergio Monteiro Basto
                     ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-08 23:47 UTC (permalink / raw)
  To: linux-kernel

I have a ~1TB filesystem that fails to mount, the message is:

EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
group 2338 not in group (block 1607003381)!
EXT3-fs: group descriptors corrupted !

A day before, it worked flawlessly.

What could have happened, and what's the best course of action?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-08 23:47 ` Molle Bestefich
@ 2006-08-09  1:33   ` Sergio Monteiro Basto
  2006-08-09 10:36   ` Molle Bestefich
  2006-08-09 11:33   ` linux-os (Dick Johnson)
  2 siblings, 0 replies; 49+ messages in thread
From: Sergio Monteiro Basto @ 2006-08-09  1:33 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

man -k 2fs

man e2fsck

umount filesystem (don't forget it)
and e2fsck /dev/h (filesystem)

On Wed, 2006-08-09 at 01:47 +0200, Molle Bestefich wrote:
> I have a ~1TB filesystem that fails to mount, the message is:
> 
> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
> group 2338 not in group (block 1607003381)!
> EXT3-fs: group descriptors corrupted !
> 
> A day before, it worked flawlessly.
> 
> What could have happened, and what's the best course of action?


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-08 23:47 ` Molle Bestefich
  2006-08-09  1:33   ` Sergio Monteiro Basto
@ 2006-08-09 10:36   ` Molle Bestefich
  2006-08-09 11:33   ` linux-os (Dick Johnson)
  2 siblings, 0 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-09 10:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: sergio

Molle Bestefich wrote:
> I have a ~1TB filesystem that fails to mount, the message is:
>
> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
> group 2338 not in group (block 1607003381)!
> EXT3-fs: group descriptors corrupted !
>
> A day before, it worked flawlessly.
>
> What could have happened, and what's the best course of action?

I should probably mention that I've been bitten by e2fsck before.
I had a filesystem with minor damage, but after running e2fsck it was
completely nuked.
Nothing was recoverable.

So before anyone suggests running e2fsck, I'd really like someone
knowledgeable to tell me what e2fsck is going to do about "group
descriptors corrupted" *BEFORE* I go ahead and blindly run it.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-08 23:47 ` Molle Bestefich
  2006-08-09  1:33   ` Sergio Monteiro Basto
  2006-08-09 10:36   ` Molle Bestefich
@ 2006-08-09 11:33   ` linux-os (Dick Johnson)
  2006-08-09 15:22     ` Molle Bestefich
  2 siblings, 1 reply; 49+ messages in thread
From: linux-os (Dick Johnson) @ 2006-08-09 11:33 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

On Tue, 8 Aug 2006, Molle Bestefich wrote:

> I have a ~1TB filesystem that fails to mount, the message is:
>
> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
                  ^^^^^^^^^^^_________

It seems as though you have a LOT of RAM if you can make a 1TB
filesystem on the loopback device!

Seriously, what are you doing, attempting to mount a big file-system
through the loop-back device or is this a copied-down message
you got during boot when initrd tried to mount a RAM disk?

> group 2338 not in group (block 1607003381)!
> EXT3-fs: group descriptors corrupted !
>

Ordinary disk repair involves running fsck on an UNMOUNTED file-system.

> A day before, it worked flawlessly.
>
> What could have happened, and what's the best course of action?

Any bad RAM, any shutdown without a proper unmount, any device hardware
error like DMA not completing properly, can cause file-system corruption.
That's why there are tools to fix it.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.24 on an i686 machine (5592.62 BogoMips).
New book: http://www.AbominableFirebug.com/


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 11:33   ` linux-os (Dick Johnson)
@ 2006-08-09 15:22     ` Molle Bestefich
  2006-08-09 15:38       ` Michael Loftis
  2006-08-10  7:44       ` Denis Vlasenko
  0 siblings, 2 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-09 15:22 UTC (permalink / raw)
  To: linux-kernel

linux-os wrote:
> Molle Bestefich wrote:
> > I have a ~1TB filesystem that fails to mount, the message is:
> >
> > EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
>                   ^^^^^^^^^^^_________
>
> It seems as though you have a LOT of RAM if you can make a 1TB
> filesystem on the loopback device!

Why is that?
loop0 is backed by a MD device.

> Seriously, what are you doing, attempting to mount a big file-system
> through the loop-back device

Yes, and it has worked for...  well... many years now.

> or is this a copied-down message
> you got during boot when initrd tried to mount a RAM disk?

No.

> > group 2338 not in group (block 1607003381)!
> > EXT3-fs: group descriptors corrupted !
>
> Ordinary disk repair involves running fsck on an UNMOUNTED file-system.

It _is_ unmounted.

(I've learned that lesson years ago.  Probably after seeing fsck
complaining loudly when I tried to run it on a mounted filesystem, if
I had to guess ;-).)

> > A day before, it worked flawlessly.
> >
> > What could have happened, and what's the best course of action?
>
> Any bad RAM, any shutdown without a proper unmount, any device hardware
> error like DMA not completing properly, can cause file-system corruption.
> That's why there are tools to fix it.

The hardware works flawlessly.
The shutdown was a regular shutdown -h.

Messages on the console indicated that Linux actually tried to
shutdown the filesystem before shutting down Samba, which is just
plain Real-F......-Stupid.  Is there no intelligent ordering of
shutdown events in Linux at all?

Samba was serving files to remote computers and had no desire to let
go of the filesystem while still running.  After 5 seconds or so,
Linux just shutdown the MD device with the filesystem still mounted.

That's what happened on a user-visible level, but what could have
happened internally in the filesystem?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 15:22     ` Molle Bestefich
@ 2006-08-09 15:38       ` Michael Loftis
  2006-08-09 18:28         ` Molle Bestefich
  2006-08-10  7:44       ` Denis Vlasenko
  1 sibling, 1 reply; 49+ messages in thread
From: Michael Loftis @ 2006-08-09 15:38 UTC (permalink / raw)
  To: Molle Bestefich, linux-kernel

--On August 9, 2006 5:22:28 PM +0200 Molle Bestefich 
<molle.bestefich@gmail.com> wrote:

> Messages on the console indicated that Linux actually tried to
> shutdown the filesystem before shutting down Samba, which is just
> plain Real-F......-Stupid.  Is there no intelligent ordering of
> shutdown events in Linux at all?

The kernel doesn't perform those, your distro's init scripts do that.  And 
various distros have various success at doing the right thing.  I've had 
the best luck with Debian and Ubuntu doing this in the right order.  RH 
seems to insist on turning off the network then network services such as 
sshd.

> Samba was serving files to remote computers and had no desire to let
> go of the filesystem while still running.  After 5 seconds or so,
> Linux just shutdown the MD device with the filesystem still mounted.

The kernel probably didn't do this, usually by the time the kernel gets to 
this point init has already sent kills to everything.  If it hasn't it 
points to problems with your init scripts, not the kernel.

>
> That's what happened on a user-visible level, but what could have
> happened internally in the filesystem?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 15:38       ` Michael Loftis
@ 2006-08-09 18:28         ` Molle Bestefich
  2006-08-09 18:41           ` Mws
                             ` (3 more replies)
  0 siblings, 4 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-09 18:28 UTC (permalink / raw)
  To: Michael Loftis; +Cc: linux-kernel

Michael Loftis wrote:
> > Is there no intelligent ordering of
> > shutdown events in Linux at all?
>
> The kernel doesn't perform those, your distro's init scripts do that.

Right.  It's all just "Linux" to me ;-).

(Maybe the kernel SHOULD coordinate it somehow,
 seems like some of the distros are doing a pretty bad job as is.)

> And various distros have various success at doing the right thing.  I've had
> the best luck with Debian and Ubuntu doing this in the right order.  RH
> seems to insist on turning off the network then network services such as
> sshd.

Seems things are worse than that.  Seems like it actually kills the
block device before it has successfully (or forcefully) unmounted the
filesystems.  Thus the killing must also be before stopping Samba,
since that's what was (always is) holding the filesystem.

It's indeed a redhat, though - Red Hat Linux release 9 (Shrike).

> > Samba was serving files to remote computers and had no desire to let
> > go of the filesystem while still running.  After 5 seconds or so,
> > Linux just shutdown the MD device with the filesystem still mounted.
>
> The kernel probably didn't do this, usually by the time the kernel gets to
> this point init has already sent kills to everything.  If it hasn't it
> points to problems with your init scripts, not the kernel.

Ok, so LKML is not appropriate for the init script issue.
Never mind that, I'll just try another distro when time comes.

I'd really like to know what the "Block bitmap for group not in group"
message means (block bitmap is pretty self explanatory, but what's a
group?).

And what will e2fsck do to my dear filesystem if I let it have a go at it?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 18:28         ` Molle Bestefich
@ 2006-08-09 18:41           ` Mws
  2006-08-09 20:17           ` Duane Griffin
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 49+ messages in thread
From: Mws @ 2006-08-09 18:41 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: Michael Loftis, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2649 bytes --]

On Wednesday 09 August 2006 20:28, Molle Bestefich wrote:
> Michael Loftis wrote:
> > > Is there no intelligent ordering of
> > > shutdown events in Linux at all?
> >
> > The kernel doesn't perform those, your distro's init scripts do that.
> 
> Right.  It's all just "Linux" to me ;-).
> 
> (Maybe the kernel SHOULD coordinate it somehow,
>  seems like some of the distros are doing a pretty bad job as is.)
> 
> > And various distros have various success at doing the right thing.  I've had
> > the best luck with Debian and Ubuntu doing this in the right order.  RH
> > seems to insist on turning off the network then network services such as
> > sshd.
> 
> Seems things are worse than that.  Seems like it actually kills the
> block device before it has successfully (or forcefully) unmounted the
> filesystems.  Thus the killing must also be before stopping Samba,
> since that's what was (always is) holding the filesystem.
> 
> It's indeed a redhat, though - Red Hat Linux release 9 (Shrike).
> 
> > > Samba was serving files to remote computers and had no desire to let
> > > go of the filesystem while still running.  After 5 seconds or so,
> > > Linux just shutdown the MD device with the filesystem still mounted.
> >
> > The kernel probably didn't do this, usually by the time the kernel gets to
> > this point init has already sent kills to everything.  If it hasn't it
> > points to problems with your init scripts, not the kernel.
> 
> Ok, so LKML is not appropriate for the init script issue.
> Never mind that, I'll just try another distro when time comes.
> 
> I'd really like to know what the "Block bitmap for group not in group"
> message means (block bitmap is pretty self explanatory, but what's a
> group?).
> 
> And what will e2fsck do to my dear filesystem if I let it have a go at it?
> -
hi, 
what i am missing is a kind of information, what type of pc you own/use.

i personally built a new one in the last few days and also encountered
problems with ext3.

i do own a amd64 x2 5000+ with asus m2n32 ws pro motherboard.

i yesterday changed my root partition from ext3 to xfs and my problems
went away. so imho there might be some issues in having 64 bit systems,
dual processor and ext3 in combination.

kernel is 2.6.17

behaviour was like: 
filesystem became corrupted due to uncommitted transactions, resulting 
in manually "fsck" checking the partition, loads of errors i did correct, but
a lot of files got corrupted. i didn't check if the sata attached drives would also
fail on ext3 cause i had them already prepared for xfs.

regards
marcel





[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 18:28         ` Molle Bestefich
  2006-08-09 18:41           ` Mws
@ 2006-08-09 20:17           ` Duane Griffin
  2006-08-09 20:47             ` Molle Bestefich
  2006-08-10  3:06           ` Jim Crilly
  2006-08-10  8:32           ` Bernd Petrovitsch
  3 siblings, 1 reply; 49+ messages in thread
From: Duane Griffin @ 2006-08-09 20:17 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

On 09/08/06, Molle Bestefich <molle.bestefich@gmail.com> wrote:
[snip]
> And what will e2fsck do to my dear filesystem if I let it have a go at it?

To be safe, run it on an image of your filesystem first. You can use
the dd command to take the image, then run e2fsck on it. Afterwards
mount it and make sure everything looks kosher. That is assuming you
have enough spare space, of course. If not then you should at least
run e2fsck with -n first to find out what it wants to do. Personally,
my risk tolerance would be closely correlated with the quality of my
backups.
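
As a rough illustration of the image-first approach described above:
the sketch below is a minimal Python stand-in for a dd-style copy, plus
a helper that only builds (and does not run) a read-only e2fsck command
line. The function names and any paths passed in are hypothetical; -n
and -f are the e2fsck options for answering "no" to every question and
for forcing a check even if the filesystem looks clean.

```python
def take_image(source_path: str, image_path: str,
               chunk_size: int = 1024 * 1024) -> int:
    """Copy source to image in fixed-size chunks; return bytes copied."""
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:          # end of the source device or file
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied


def fsck_dry_run_command(target: str) -> list[str]:
    """Build (but do not execute) a read-only e2fsck invocation."""
    # -f: force a check; -n: open read-only and answer "no" to everything
    return ["e2fsck", "-f", "-n", target]
```

Working on the copy means a bad repair only costs the copy; the
original stays as it was until the result has been inspected.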

Cheers,
Duane.

-- 
"I never could learn to drink that blood and call it wine" - Bob Dylan

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 20:17           ` Duane Griffin
@ 2006-08-09 20:47             ` Molle Bestefich
       [not found]               ` <e9e943910608091527t3b88da7eo837f6adc1e1e6f98@mail.gmail.com>
  0 siblings, 1 reply; 49+ messages in thread
From: Molle Bestefich @ 2006-08-09 20:47 UTC (permalink / raw)
  To: Duane Griffin; +Cc: linux-kernel

Duane Griffin wrote:
> > And what will e2fsck do to my dear filesystem if I let it have a go at it?
>
> To be safe, run it on an image of your filesystem first.

Yes, hmm, I don't have another terabyte handy, unfortunately.

> That is assuming you have enough spare space, of course.
> If not then you should at least run e2fsck with -n first to find out
> what it wants to do.

How close to 1-1 does "-n" relate to non-"-n" ?

For example, does e2fsck take into consideration the changes it would
have done itself in regular mode when it proceeds to the next problem
and/or phase of a -n operation?

If it doesn't, then that command is, well, totally useless.

So :-).  Does it take that into consideration?

> Personally, my risk tolerance would be closely correlated with
> the quality of my backups.

I hear you loud and clear...
Sigh ;-).

> "I never could learn to drink that blood and call it wine" - Bob Dylan

Hmm.  I like it.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
       [not found]               ` <e9e943910608091527t3b88da7eo837f6adc1e1e6f98@mail.gmail.com>
@ 2006-08-09 23:09                 ` Molle Bestefich
  2006-08-10  0:08                   ` Duane Griffin
  0 siblings, 1 reply; 49+ messages in thread
From: Molle Bestefich @ 2006-08-09 23:09 UTC (permalink / raw)
  To: Duane Griffin; +Cc: linux-kernel

Duane Griffin wrote:
> > How close to 1-1 does "-n" relate to non-"-n" ?
> >
> > For example, does e2fsck take into consideration the changes it would
> > have done itself in regular mode when it proceeds to the next problem
> > and/or phase of a -n operation?
>
> It corresponds perfectly to you answering "no" to all questions :)
> Sorry, I don't have a much better answer than that.

A good answer, even if it's one that can be found in the manual :-).

> > If it doesn't, then that command is, well, totally useless.
>
> That is too strong.

I don't think so.

If it doesn't take into account own changes, then the -n command is
unable to produce even a slightly accurate resemblance of what would
happen if I did a real run.

And that's about the only use case I can come up with for -n...

> You should be able to get an idea how severe the damage
> is, at least.

If it's completely inaccurate, I can't trust the result, so that doesn't
help me much, if any.

> From a quick read of the code it looks like your problem
> is related to dodgy data in the superblock, and e2fsck will attempt to
> recover & continue by reading the backup superblock.

Thanks a lot for checking !

I wonder then, will it write back this alternate superblock?

Is there anything I can do to control the process, like:
Do a test mount with one of the alternate superblocks?
Tell fsck to test a specific superblock; afterwards tell fsck to use a
specific superblock?

That would be useful.
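
On the question of alternate superblocks, a small sketch of where the
backups would normally live may help. It assumes the sparse_super
layout (backups in group 1 and in groups that are powers of 3, 5 and 7)
and 4 KiB blocks with 32768 blocks per group; neither assumption is
confirmed for this filesystem, and the function names are made up for
the example. e2fsck accepts such a block number through its -b option,
and mount has an sb= option for ext2/ext3 (which, as far as I know,
counts in 1 KiB units rather than filesystem blocks).

```python
def is_power_of(n: int, base: int) -> bool:
    """True if n == base**k for some k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def backup_superblock_blocks(total_blocks: int,
                             blocks_per_group: int = 32768,
                             first_data_block: int = 0) -> list[int]:
    """Block numbers of backup superblocks under the sparse_super layout."""
    # Number of block groups, rounding the last partial group up.
    group_count = -(-(total_blocks - first_data_block) // blocks_per_group)
    blocks = []
    for group in range(1, group_count):   # group 0 holds the primary copy
        if any(is_power_of(group, base) for base in (3, 5, 7)):
            blocks.append(first_data_block + group * blocks_per_group)
    return blocks
```

For a filesystem with 1 KiB blocks the defaults would instead be 8192
blocks per group and a first data block of 1, which puts the first
backup at block 8193.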

> It does that regardless of whether you use -n,
> so in that respect at least it will operate in the
> same way as "normal" operation.

Ok, that's very good to know, thanks a lot.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 23:09                 ` Molle Bestefich
@ 2006-08-10  0:08                   ` Duane Griffin
  2006-08-10 21:00                     ` Molle Bestefich
  0 siblings, 1 reply; 49+ messages in thread
From: Duane Griffin @ 2006-08-10  0:08 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

On 10/08/06, Molle Bestefich <molle.bestefich@gmail.com> wrote:
> If it doesn't take into account own changes, then the -n command is
> unable to produce even a slightly accurate resemblance of what would
> happen if I did a real run.

It takes into account some of them (such as reading data from the
backup superblock if it detects corruption). Others will be irrelevant
for further operations. Many reports will be accurate, especially
fatal ones. I consider that useful, YMMV.

Cheers,
Duane.

-- 
"I never could learn to drink that blood and call it wine" - Bob Dylan

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 18:28         ` Molle Bestefich
  2006-08-09 18:41           ` Mws
  2006-08-09 20:17           ` Duane Griffin
@ 2006-08-10  3:06           ` Jim Crilly
  2006-08-10  9:48             ` Molle Bestefich
  2006-08-10  8:32           ` Bernd Petrovitsch
  3 siblings, 1 reply; 49+ messages in thread
From: Jim Crilly @ 2006-08-10  3:06 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: Michael Loftis, linux-kernel

On 08/09/06 08:28:34PM +0200, Molle Bestefich wrote:
> Michael Loftis wrote:
> >> Is there no intelligent ordering of
> >> shutdown events in Linux at all?
> >
> >The kernel doesn't perform those, your distro's init scripts do that.
> 
> Right.  It's all just "Linux" to me ;-).
> 

Then I guess it's time to break out the learning cap and figure out what's
what. =)

> (Maybe the kernel SHOULD coordinate it somehow,
> seems like some of the distros are doing a pretty bad job as is.)
> 

That's pretty much impossible, the best the kernel can do is send signals
to all of the running processes. If anything requires anything more
complicated (and many do) then even worse things will happen.

> >And various distros have various success at doing the right thing.  I've 
> >had
> >the best luck with Debian and Ubuntu doing this in the right order.  RH
> >seems to insist on turning off the network then network services such as
> >sshd.
> 
> Seems things are worse than that.  Seems like it actually kills the
> block device before it has successfully (or forcefully) unmounted the
> filesystems.  Thus the killing must also be before stopping Samba,
> since that's what was (always is) holding the filesystem.
> 
> It's indeed a redhat, though - Red Hat Linux release 9 (Shrike).
> 

Why are you using such an old distribution? I know it's only been 3 years,
but a lot has changed and I don't think anyone supports RH9 or earlier
anymore.

> >> Samba was serving files to remote computers and had no desire to let
> >> go of the filesystem while still running.  After 5 seconds or so,
> >> Linux just shutdown the MD device with the filesystem still mounted.
> >
> >The kernel probably didn't do this, usually by the time the kernel gets to
> >this point init has already sent kills to everything.  If it hasn't it
> >points to problems with your init scripts, not the kernel.
> 
> Ok, so LKML is not appropriate for the init script issue.
> Never mind that, I'll just try another distro when time comes.
> 
> I'd really like to know what the "Block bitmap for group not in group"
> message means (block bitmap is pretty self explanatory, but what's a
> group?).
> 

ext2 breaks the filesystem up into block groups, a wild guess about the
error message would be that it couldn't find the block bitmap for a certain
group or the bitmap that it did find wasn't in the correct group.
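
To make that guess concrete: as far as I can tell, the mount-time check
boils down to a range test, i.e. the block number stored in a group's
descriptor for its block bitmap has to fall inside the blocks that
belong to that group. A minimal sketch, assuming 4 KiB blocks (hence
32768 blocks per group) and a first data block of 0; both values are
assumptions about this filesystem and the helper names are
hypothetical. The shorter final group is ignored for simplicity.

```python
def group_block_range(group: int,
                      blocks_per_group: int = 32768,
                      first_data_block: int = 0) -> tuple[int, int]:
    """First and last block number belonging to a block group."""
    first = first_data_block + group * blocks_per_group
    return first, first + blocks_per_group - 1


def bitmap_in_group(bitmap_block: int, group: int,
                    blocks_per_group: int = 32768,
                    first_data_block: int = 0) -> bool:
    """Range test: does the bitmap block lie inside its own group?"""
    first, last = group_block_range(group, blocks_per_group, first_data_block)
    return first <= bitmap_block <= last


# Values taken from the reported error message.
print(group_block_range(2338))
print(bitmap_in_group(1607003381, 2338))
```

Under these assumptions group 2338 covers blocks 76611584 through
76644351, so 1607003381 is nowhere near it. A ~1TB filesystem with
4 KiB blocks has only about 268 million blocks in total, so the stored
value looks like garbage in the descriptor rather than a bitmap that
merely landed in a neighbouring group.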

Jim.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 15:22     ` Molle Bestefich
  2006-08-09 15:38       ` Michael Loftis
@ 2006-08-10  7:44       ` Denis Vlasenko
  1 sibling, 0 replies; 49+ messages in thread
From: Denis Vlasenko @ 2006-08-10  7:44 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

On Wednesday 09 August 2006 17:22, Molle Bestefich wrote:
> The hardware works flawlessly.
> The shutdown was a regular shutdown -h.
> 
> Messages on the console indicated that Linux actually tried to
> shutdown the filesystem before shutting down Samba, which is just
> plain Real-F......-Stupid.  Is there no intelligent ordering of
> shutdown events in Linux at all?

There is no shutdown ordering in the Linux *kernel*, it is
the responsibility of the userspace to arrange for that.
IOW: the distribution packagers should do it,
or you, if you maintain your custom-configured system.

> Samba was serving files to remote computers and had no desire to let
> go of the filesystem while still running.  After 5 seconds or so,

Somebody forgot to add a kill -9 to the shutdown scripts.

> Linux just shutdown the MD device with the filesystem still mounted.
> 
> That's what happened on a user-visible level, but what could have
> happened internally in the filesystem?
--
vda

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-09 18:28         ` Molle Bestefich
                             ` (2 preceding siblings ...)
  2006-08-10  3:06           ` Jim Crilly
@ 2006-08-10  8:32           ` Bernd Petrovitsch
  3 siblings, 0 replies; 49+ messages in thread
From: Bernd Petrovitsch @ 2006-08-10  8:32 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: Michael Loftis, linux-kernel

On Wed, 2006-08-09 at 20:28 +0200, Molle Bestefich wrote:
> Michael Loftis wrote:
> > > Is there no intelligent ordering of
> > > shutdown events in Linux at all?
> >
> > The kernel doesn't perform those, your distro's init scripts do that.
> 
> Right.  It's all just "Linux" to me ;-).

Then you are very probably questioning at the wrong place.

> (Maybe the kernel SHOULD coordinate it somehow,
>  seems like some of the distros are doing a pretty bad job as is.)

Patch your "Linux" to dump the output of "strace" of the init scripts
(it should be enough to improve the correct line in /etc/inittab) into a
log file and have fun considering the heuristics to be used in the
kernel to detect the dependencies.

AFAIK typical init scripts, the expression "extremely hard" is the
understatement of the year for this task.

	Bernd
-- 
Firmix Software GmbH                   http://www.firmix.at/
mobil: +43 664 4416156                 fax: +43 1 7890849-55
          Embedded Linux Development and Services


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10  3:06           ` Jim Crilly
@ 2006-08-10  9:48             ` Molle Bestefich
  2006-08-10 11:41               ` linux-os (Dick Johnson)
                                 ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-10  9:48 UTC (permalink / raw)
  To: linux-kernel

Duane Griffin wrote:
> It takes into account some of them (such as reading data from the
> backup superblock if it detects corruption). Others will be
> irrelevant for further operations.

Ok, maybe it is accurate?

> Many reports will be accurate

Ok, perhaps not then :-).

I'm still confused as to the performance of "-n".
It would be _very_ good to fix this deficiency in the man page of e2fsck.

Thanks Duane, you've been most helpful.

Jim Crilly wrote:
> > Right.  It's all just "Linux" to me ;-).
>
> Then I guess it's time to break out the learning cap and figure
> out what's what. =)

;-).

You can start by phoning Red Hat.  They call their entire product
"Red Hat Linux", so that pretty much means that "Linux" basically
covers everything, not just the kernel.

> > It's indeed a redhat, though - Red Hat Linux release 9 (Shrike).
>
> Why are you using such an old distribution? I know it's only been 3
> years, but a lot has changed and I don't think anyone supports RH9
> or earlier anymore.

As far as I remember, I configured it to automatically update everything.
Apparently that function just broke itself very early on :-).

I guess the problem is that I don't know a single Linux packaging system
that actually works well enough to keep a system up to date at all times,
and I don't have any free time to spend on reinstalling systems all the
time.

I think most of the package managers break because their dependency system
sucks.  Some of them don't suck, but they break because there's no
integrity checks, and package maintainers can dump any kind of bizarre
corrupt dependencies they like into them.  That's how Gentoo works, for
example.  Others have even more bizarre ways of breaking, again Gentoo as
an example requires the user to run a "switch to newer GCC" command from
time to time, otherwise random packages just start breaking.

AFAIK, every single Linux package manager on the planet is half-ass, broken
like above or in some other way.  If you know of one that's actually well
thought through on all planes and well implemented and thus works good enough
to keep a system up to date for three years in a row without human
intervention....
Please speak up!!!

> > (Maybe the kernel SHOULD coordinate it somehow,
> > seems like some of the distros are doing a pretty bad job as is.)
>
> That's pretty much impossible, the best the kernel can do is send
> signals to all of the running processes.

Impossible?  Few things in the software world are impossible.

Surely it's possible to create a kernel interface where processes
can tell the kernel about which other processes they'd like to
outlive and which ones they'd like to get killed before.

The kernel could then coordinate the killing of processes in a
"shutdown" function, which the various distro's 'reboot' and
'shutdown' scripts could call.

And voila, that difficult task of assessing in which order to do
things is out of the hands of distros like Red Hat, and into the
hands of those people who actually make the binaries.

Which is probably a good thing, because

a) Red Hat's init scripts probably fail for me because there's
   something in my setup that Red Hat didn't expect.  A greatly
   simplified system as outlined above would help to fix things
   like this.

b) Less duplicated effort in the form of init script coding for
   the distro maintainers.

I realize that details are totally absent in the above, but at least
it doesn't look to me like it's impossible at all.
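
To give the idea a little shape, here is a userspace sketch of the
ordering step only. It is not a kernel interface and nothing like it
exists in the init scripts discussed here; the step names are
hypothetical. Each step declares which other steps must finish before
it, and a topological sort (graphlib, Python 3.9 or later) produces an
order that respects every declaration.

```python
from graphlib import TopologicalSorter


def shutdown_order(must_finish_first: dict[str, set[str]]) -> list[str]:
    """Return the steps ordered so that every prerequisite runs earlier.

    must_finish_first maps a step to the set of steps that have to be
    complete before it may start. A circular declaration raises
    graphlib.CycleError.
    """
    return list(TopologicalSorter(must_finish_first).static_order())


# Samba must be stopped before the filesystem is unmounted, and the
# filesystem must be unmounted before the MD device is stopped.
constraints = {
    "umount /data": {"stop smbd"},
    "stop md0": {"umount /data"},
}
print(shutdown_order(constraints))
```

The hard part, as others have pointed out, is not the sort but getting
every program to declare its dependencies correctly.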

> ext2 breaks the filesystem up into block groups,

Thanks for the info!

> a wild guess about the error message would be that it couldn't
> find the block bitmap for a certain group

Hmm, I would have expected it to find something completely
corrupt somewhere instead of finding nothing at all.

> or the bitmap that it did find wasn't in the correct group.

Implying that they're linked both ways?
That would probably be a very good thing wrt. recoverability.
Interesting thought!

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10  9:48             ` Molle Bestefich
@ 2006-08-10 11:41               ` linux-os (Dick Johnson)
  2006-08-10 12:21                 ` Molle Bestefich
  2006-08-10 12:19               ` Helge Hafting
  2006-08-10 16:10               ` John Stoffel
  2 siblings, 1 reply; 49+ messages in thread
From: linux-os (Dick Johnson) @ 2006-08-10 11:41 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: Linux kernel


On Thu, 10 Aug 2006, Molle Bestefich wrote:

> Duane Griffin wrote:
>> It takes into account some of them (such as reading data from the
>> backup superblock if it detects corruption). Others will be
>> irrelevant for further operations.
>
> Ok, maybe it is accurate?
>
>> Many reports will be accurate
>
> Ok, perhaps not then :-).
>
> I'm still confused as to the performance of "-n".
> It would be _very_ good to fix this deficiency in the man page of e2fsck.
>
>
> Thanks Duane, you've been most helpful.
>
>
> Jim Crilly wrote:
>>> Right.  It's all just "Linux" to me ;-).
>>
>> Then I guess it's time to break out the learning cap and figure
>> out what's what. =)
>
> ;-).
>
> You can start by phoning Red Hat.  They call their entire product
> "Red Hat Linux", so that pretty much means that "Linux" basically
> covers everything, not just the kernel.
>
>
>>> It's indeed a redhat, though - Red Hat Linux release 9 (Shrike).
>>
>> Why are you using such an old distribution? I know it's only been 3
>> years, but a lot has changed and I don't think anyone supports RH9
>> or earlier anymore.
>
> As far as I remember, I configured it to automatically update everything.
> Apparently that function just broke itself very early on :-).
>
> I guess the problem is that I don't know a single Linux packaging system
> that actually works well enough to keep a system up to date at all times,
> and I don't have any free time to spend on reinstalling systems all the
> time.
>
> I think most of the package managers break because their dependency system
> sucks.  Some of them don't suck, but they break because there's no
> integrity checks, and package maintainers can dump any kind of bizarre
> corrupt dependencies they like into them.  That's how Gentoo works, for
> example.  Others have even more bizarre ways of breaking, again Gentoo as
> an example requires the user to run a "switch to newer GCC" command from
> time to time, otherwise random packages just start breaking.
>
> AFAIK, every single Linux package manager on the planet is half-ass, broken
> like above or in some other way.  If you know of one that's actually well
> thought through on all planes and well implemented and thus works good enough
> to keep a system up to date for three years in a row without human
> intervention....
> Please speak up!!!
>
>
>>> (Maybe the kernel SHOULD coordinate it somehow,
>>> seems like some of the distros are doing a pretty bad job as is.)
>>
>> That's pretty much impossible, the best the kernel can do is send
>> signals to all of the running processes.
>
> Impossible?  Few things in the software world are impossible.
>
> Surely it's possible to create a kernel interface where processes
> can tell the kernel about which other processes they'd like to
> outlive and which ones they'd like to get killed before.
>
> The kernel could then coordinate the killing of processes in a
> "shutdown" function, which the various distro's 'reboot' and
> 'shutdown' scripts could call.
>
> And voila, that difficult task of assessing in which order to do
> things is out of the hands of distros like Red Hat, and into the
> hands of those people who actually make the binaries.
>
> Which is probably a good thing, because
>
> a) Red Hat's init scripts probably fail for me because there's
>   something in my setup that Red Hat didn't expect.  A greatly
>   simplified system as outlined above would help to fix things
>   like this.
>
> b) Less duplicated effort in the form of init script coding for
>   the distro maintainers.
>
> I realize that details are totally absent in the above, but at least
> it doesn't look to me like it's impossible at all.
>
>
>> ext2 breaks the filesystem up into block groups,
>
> Thanks for the info!
>
>> a wild guess about the error message would be that it couldn't
>> find the block bitmap for a certain group
>
> Hmm, I would have expected it to find something completely
> corrupt somewhere instead of finding nothing at all.
>
>> or the bitmap that it did find wasn't in the correct group.
>
> Implying that they're linked both ways?
> That would probably be a very good thing wrt. recoverability.
> Interesting thought!

What is it that you are attempting to do? First you show us some
text obtained while attempting to run fsck on the loop device,
claiming that this was obtained from a 1TB file-system that
was destroyed by Linux. Then you spend several days telling us
that linux is no good. Enough is enough.

If you had a 1TB file-system and you knew anything about Unix or
Linux, it would have been fixed by now -- and BTW, samba can't
destroy a file-system, no matter how many files were open.
The worst possible situation is that files, open for write, may
not be completely written and this only for files that were
being created or extended. You still have the original file-data
and all the rest of the files on your file system.

Another point... ext3 is a journaled file-system. Even when
forced off by hitting the reset switch, ext3 will quietly
announce "recovering from journal" and mount just fine.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.24 on an i686 machine (5592.62 BogoMips).
New book: http://www.AbominableFirebug.com/

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10  9:48             ` Molle Bestefich
  2006-08-10 11:41               ` linux-os (Dick Johnson)
@ 2006-08-10 12:19               ` Helge Hafting
  2006-08-10 13:00                 ` Molle Bestefich
  2006-09-24  8:56                 ` Molle Bestefich
  2006-08-10 16:10               ` John Stoffel
  2 siblings, 2 replies; 49+ messages in thread
From: Helge Hafting @ 2006-08-10 12:19 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

Molle Bestefich wrote:
[...]
>
> As far as I remember, I configured it to automatically update everything.
> Apparently that function just broke itself very early on :-).
>
> I guess the problem is that I don't know a single Linux packaging system
> that actually works well enough to keep a system up to date at all times,
> and I don't have any free time to spend on reinstalling systems all the
> time.
Have you considered debian then?  That package system certainly
has been able to keep many a system running for years and years.
You never reinstall, just upgrade.
>
> I think most of the package managers break because their dependency system
> sucks.  Some of them don't suck, but they break because there's no
> integrity checks, and package maintainers can dump any kind of bizarre
> corrupt dependencies they like into them.  That's how Gentoo works, for
Sure, you have to trust someone.  If you can't trust any distro and its
maintainers, your only choice is to roll your own.
> example.  Others have even more bizarre ways of breaking, again Gentoo as
> an example requires the user to run a "switch to newer GCC" command from
> time to time, otherwise random packages just start breaking.
>
> AFAIK, every single Linux package manager on the planet is half-ass, broken
> like above or in some other way.  If you know of one that's actually well
> thought through on all planes and well implemented and thus works good enough
> to keep a system up to date for three years in a row without human
> intervention....
> Please speak up!!!
Well, if you expect to keep a computer running for three years
without human intervention - good luck to you!  Not only will there be
new vulnerabilities and attacks via the internet in that time, there is also
a substantial risk of hardware breakdown.  Disks in particular seem not
to live much more than three years, and if you have several, the chance
is bigger that one goes.  Raid protects the data, but surely you will
intervene to replace the failed drive?  Or do you install six hot spares
so the system really can keep running alone?

It is certainly possible to run debian and spend 5min per week
on running "apt-get dist-upgrade" which installs anything that
was upgraded since the last time.  And if you have many identical
servers, then you know that when one upgrade went without
problems the others will too.
>> > (Maybe the kernel SHOULD coordinate it somehow,
>> > seems like some of the distros are doing a pretty bad job as is.)
>>
>> That's pretty much impossible, the best the kernel can do is send
>> signals to all of the running processes.
>
> Impossible?  Few things in the software world are impossible.
>
> Surely it's possible to create a kernel interface where processes
> can tell the kernel about which other processes they'd like to
> outlive and which ones they'd like to get killed before.
>
> The kernel could then coordinate the killing of processes in a
> "shutdown" function, which the various distro's 'reboot' and
> 'shutdown' scripts could call.
Such coordination can certainly be done - there is just no
reason at all to do it in the _kernel_.  This is the job of the
program called "init".  (The kernel is a very important piece of
a linux system, but don't make the mistake of believing
it therefore should be the "general manager" for all things less
important.)

Init is the first program that runs,
it is started up by the kernel.  Init will then start all other
software you need, such as samba, any other server software,
and of course the login services. When you shut the pc down,
init is the program responsible for stopping everything in
a sane order.

Init is customizable, by editing and/or renaming the so-called
initscripts.  That way, you can alter the order of startup and
shutdown of software, if your distribution didn't get this right.
This isn't all that hard, and a linux system administrator is
supposed to be able to make simple adjustments in this area.
Many linux/unix books document how initscripts work, and there
is usually plenty of online documentation as well.

> And voila, that difficult task of assessing in which order to do
> things is out of the hands of distros like Red Hat, and into the
> hands of those people who actually make the binaries.
Not so easy.  You do not want to shut down md devices because
samba is using them. Someone else may run samba on a single
harddisk and also have some md-devices that they take down
and bring up a lot.  So having samba generally depend on md doesn't
work.  Your setup needs it, others may have different needs.
That's why the initscripts are _scripts_, simple textfiles that
administrators can manipulate without having to know C programming.
>
> Which is probably a good thing, because
>
> a) Red Hat's init scripts probably fail for me because there's
>   something in my setup that Red Hat didn't expect.  A greatly
>   simplified system as outlined above would help to fix things
>   like this.
Learn to manipulate the initscripts then.  Changing the
shutdown order really is as simple as renaming samba's
script file so it occurs earlier in the shutdown order than
the script responsible for taking down the md devices.
You don't even need to understand shellscript programming to
re-order stuff.
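
As a rough illustration of the renaming trick (a sketch, not Red Hat's actual rc logic): SysV-style rc scripts run the K* links of a runlevel directory in sorted filename order, so the two-digit number in the name decides what gets stopped first.  K35smb is the Red Hat name for samba's kill link that comes up elsewhere in this thread; K90mdstop and K95smb are made-up names used only for this example, and plain sorted order is an assumption about how the rc script walks the directory.

```python
def shutdown_order(link_names):
    """Return the K* (kill) links in the order a SysV-style rc script would run them.

    Assumption: the rc script iterates over K* links in plain sorted filename order.
    """
    return sorted(name for name in link_names if name.startswith("K"))


# K35smb is samba's kill link; K90mdstop is a hypothetical name for illustration.
links = ["S01halt", "K90mdstop", "K35smb"]
print(shutdown_order(links))

# Renaming samba's link to a higher number than the md link reverses the order.
renamed = ["S01halt", "K90mdstop", "K95smb"]
print(shutdown_order(renamed))
```

With the first set of names samba is stopped before the (hypothetical) md script; with the second it would be stopped after it, which is the situation to avoid.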
>
> b) Less duplicated effort in the form of init script coding for
>   the distro maintainers.
It is open source, they can copy each other's scripts to save
effort.  They often do -  and sometimes the initscript for a
particular piece of sw is written by the maker of that sw too.

Helge Hafting

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10 11:41               ` linux-os (Dick Johnson)
@ 2006-08-10 12:21                 ` Molle Bestefich
  0 siblings, 0 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-10 12:21 UTC (permalink / raw)
  To: Linux kernel

linux-os (Dick Johnson) wrote:
> What is it that you are attempting to do?

Fix my filesystem.
Prevent this situation from happening for others, if at all possible.

> First you show us some text obtained while attempting
> to run fsck on the loop device,
> claiming that this was obtained from a 1TB file-system that
> was destroyed by Linux. Then you spend several days telling us
> that linux is no good. Enough is enough.

That was never really my point, apologies if it came across that way.

> If you had a 1TB file-system and you knew anything about Unix or
> Linux, it would have been fixed by now

What?  Uh.
Well, whatever.

> -- and BTW, samba can't
> destroy a file-system, no matter how many files were open.

Never claimed that it did.

> The worse possible situation is that files, open for write, may
> not be completely written and this only for files that were
> being created or extended. You still have the original file-data
> and all the rest of the files on your file system.

Well, it doesn't mount, so they're kind of irretrievable right now.

> Another point... ext3 is a journaled file-system. Even when
> forced off by hitting the reset switch, ext3 will quietly
> announce "recovering from journal" and mount just fine.

Obviously that's not true.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10 12:19               ` Helge Hafting
@ 2006-08-10 13:00                 ` Molle Bestefich
  2006-08-10 14:40                   ` gmu 2k6
  2006-09-24  8:56                 ` Molle Bestefich
  1 sibling, 1 reply; 49+ messages in thread
From: Molle Bestefich @ 2006-08-10 13:00 UTC (permalink / raw)
  To: Helge Hafting; +Cc: linux-kernel

Helge Hafting wrote:
> Have you considered debian then?  That package system certainly
> has been able to keep many a system running for years and years.
> You never reinstall, just upgrade.

Sounds cool, I'll try that.
Thanks.

> Well, if you expect to keep a computer running for three years
> without human intervention - good luck to you!  Not only will there be
> new vulnerabilities and attacks via the internet in that time,

It's NATted, so it's basically unreachable unless you know how to get to it.

> there is also a substantial risk of hardware breakdown.

Hasn't happened yet.

> Disks in particular seem not to live much more than three years,
> and if you have several, the chance is bigger that one goes.

Fair enough, but I've got mdmonitor to send me an email when that happens.

> It is certainly possible to run debian and spend 5min per week
> on running "apt-get dist-upgrade" which installs anything that
> was upgraded since the last time.

Cool, a cron job should take care of that.

> And if you have many identical servers, then you know that when
> one upgrade went without problems the others will too.

Oh....  Sounds a bit like it's not really as good as you once said it was.

> > And voila, that difficult task of assessing in which order to do
> > things is out of the hands of distros like Red Hat, and into the
> > hands of those people who actually make the binaries.
>
> Not so easy.  You do not want to shut down md devices because
> samba is using them.

Definitely not, I want to shut down Samba because it's using a
filesystem, which is using a MD device.

> Someone else may run samba on a single harddisk and also have
> some md-devices that they take down and bring up a lot.

Fair enough.
I guess a dependency system would have to be more complicated, then,
taking into account which particular resources a process depends on,
not just which subsystem.

> So having samba generally depend on md doesn't work.

Right.

> Your setup needs it, others may have different needs.
> That's why the initscripts are _scripts_, simple textfiles that
> administrators can manipulate without having to know C programming.

I'm not sure I buy your argumentation.  But I do acknowledge that
being able to modify init scripts, without having C knowledge is
definitely a plus, seeing as they're obviously often far from perfect.

> Learn to manipulate the initscripts then.

Yeah, I should sit down and look at how Red Hat did their whole init thing.

OTOH, I don't really have time to dissect 'em and find what problem
they have.  I think I'll just switch distro and see if I have better
luck with something else.

> Changing the shutdown order really is as
> simple as renaming samba's script file

I have a feeling that this is not the problem.

After all, this should be something that every distro has gotten
right, it's not really an issue you have to think long about.

I think the problem occurs in the "sending processes the TERM/KILL
signal" phase.
Perhaps because one phase is initiated too early, before various
services such as Samba have shut down.

Anyway, let's all forget about the init scripts forthwith, they're not
really relevant for LKML I think.

Concentrate on the ext3 issue :-).

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10 13:00                 ` Molle Bestefich
@ 2006-08-10 14:40                   ` gmu 2k6
  0 siblings, 0 replies; 49+ messages in thread
From: gmu 2k6 @ 2006-08-10 14:40 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

On 8/10/06, Molle Bestefich <molle.bestefich@gmail.com> wrote:
> Helge Hafting wrote:
> > Have you considered debian then?  That package system certainly
> > has been able to keep many a system running for years and years.
> > You never reinstall, just upgrade.
>
> Sounds cool, I'll try that.
> Thanks.
>
> > Well, if you expect to keep a computer running for three years
> > without human intervention - good luck to you!  Not only will there be
> > new vulnerabilities and attacks via the internet in that time,
>
> It's NATted, so it's basically unreachable unless you know how to get to it.

and the villain you have to fear will know much better than anybody
else how to reach the box, that is for sure.

[snip]

> Fair enough, but I've got mdmonitor to send me an email when that happens.
>
> > It is certainly possible to run debian and spend 5min per week
> > on running "apt-get dist-upgrade" which installs anything that
> > was upgraded since the last time.
>
> Cool, a cron job should take care of that.

http://packages.debian.org/testing/admin/cron-apt

[snip]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10  9:48             ` Molle Bestefich
  2006-08-10 11:41               ` linux-os (Dick Johnson)
  2006-08-10 12:19               ` Helge Hafting
@ 2006-08-10 16:10               ` John Stoffel
  2006-08-10 19:10                 ` Molle Bestefich
  2006-08-11 13:26                 ` Horst H. von Brand
  2 siblings, 2 replies; 49+ messages in thread
From: John Stoffel @ 2006-08-10 16:10 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

>>>>> "Molle" == Molle Bestefich <molle.bestefich@gmail.com> writes:

Molle> I guess the problem is that I don't know a single Linux
Molle> packaging system that actually works well enough to keep a
Molle> system up to date at all times, and I don't have any free time
Molle> to spend on reinstalling systems all the time.

Debian works the best in my experience.  Yum sucks rocks, it's a pain
to configure and when it does pull in stuff, it pulls in *everything*
that you don't want.  

Not that apt-get is perfect, but it works well.  My main machine at
home is an old Debian Stable install upgraded pretty much daily.
Rarely do things break.  And rarely do you want packages updated
without you thinking about it.  

If you need to maintain a bunch of systems, all at the same rev, then
you will obviously need a master system to test on and from which to
push updates to the clients.  In this case, you'd set up your own
private repository which the clients would pull from.  

Molle> I think most of the package managers break because their
Molle> dependency system sucks.  Some of them don't suck, but they
Molle> break because there's no integrity checks, and package
Molle> maintainers can dump any kind of bizarre corrupt dependencies
Molle> they like into them.  That's how Gentoo works, for example.
Molle> Others have even more bizarre ways of breaking, again Gentoo as
Molle> an example requires the user to run a "switch to newer GCC"
Molle> command from time to time, otherwise random packages just start
Molle> breaking.

I tried gentoo a bunch of years ago and didn't like it, and it
certainly didn't give me the speedup it claimed it would have.  I've
been happy with Debian.  Thinking about Ubuntu more...

Molle> Surely it's possible to create a kernel interface where
Molle> processes can tell the kernel about which other processes
Molle> they'd like to outlive and which ones they'd like to get killed
Molle> before.

This has nothing to do with the kernel, it's all a userspace issue.  

Molle> The kernel could then coordinate the killing of processes in a
Molle> "shutdown" function, which the various distro's 'reboot' and
Molle> 'shutdown' scripts could call.

Again, userspace completely.  You're asking for policy in the kernel,
where it doesn't belong.  

Molle> And voila, that difficult task of assessing in which order to
Molle> do things is out of the hands of distros like Red Hat, and into
Molle> the hands of those people who actually make the binaries.

*bwah hah hah!*  And you think they'll get it right?  So what happens
when two packages, call them A and B, have a circular dependency on
each other?  Who wins then?

It's not as simple an issue as you think.  

John

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10 16:10               ` John Stoffel
@ 2006-08-10 19:10                 ` Molle Bestefich
  2006-08-11  8:06                   ` Helge Hafting
  2006-08-11 13:26                 ` Horst H. von Brand
  1 sibling, 1 reply; 49+ messages in thread
From: Molle Bestefich @ 2006-08-10 19:10 UTC (permalink / raw)
  To: John Stoffel; +Cc: linux-kernel

John Stoffel wrote:
> Molle> And voila, that difficult task of assessing in which order to
> Molle> do things is out of the hands of distros like Red Hat, and into
> Molle> the hands of those people who actually make the binaries.
>
> *bwah hah hah!*

No need to ridicule :-).
After all, I'm just saying that there's got to be a simpler, stabler
and more transparent way than to have all this logic sit in shell
scripts.

> So what happens when two packages, call them A and B,
> have a circular dependency on each other?  Who wins then?

They both get terminated at *exactly* the same time :-)... Nah, just kidding.

In that case, I imagine either
a) the system will log errors to syslog and pick a random order, or
b) the system will refuse to shut down, politely returning a
message to the user space tool that asked for the shutdown, saying
"there's an inconsistency in the ordering rules, please fix that
first".  The guy who tapped in "shutdown" would have to kill one of
the processes manually.  (And probably also upgrade the affected
software, or file a bug report, or whatever.)

I Googled for a similar software construct, and came upon the SCM in Windows.
Seems you can make Windows drivers and system services depend on each other.
In the case where there exists a circular dependency, the SCM refuses
to even start the affected services.

So there's a third possibility.
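
A minimal sketch of option b), refusing to proceed when the ordering rules are circular.  This is only an illustration in user space, not an existing kernel or init interface: the rule format (a mapping from each service to the services that must be stopped before it) and the function name are made up for the example, and it relies on Python's standard graphlib module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def plan_shutdown(stop_before):
    """Return a stop order, or raise ValueError if the rules are circular.

    stop_before maps a service to the set of services that have to be
    stopped before it (hypothetical rule format, for illustration only).
    """
    try:
        return list(TopologicalSorter(stop_before).static_order())
    except CycleError as exc:
        # exc.args[1] holds the nodes that form the cycle.
        cycle = " -> ".join(exc.args[1])
        raise ValueError(f"inconsistent ordering rules: {cycle}") from exc


# Samba uses a filesystem, which sits on an md device.
print(plan_shutdown({"md": {"filesystem"}, "filesystem": {"samba"}}))

# A and B each insist on outliving the other: the plan is refused.
try:
    plan_shutdown({"A": {"B"}, "B": {"A"}})
except ValueError as err:
    print(err)
```

The first call yields samba, then the filesystem, then md; the second one reports the inconsistency instead of picking an arbitrary order.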

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10  0:08                   ` Duane Griffin
@ 2006-08-10 21:00                     ` Molle Bestefich
  2006-08-12 16:38                       ` Theodore Tso
  0 siblings, 1 reply; 49+ messages in thread
From: Molle Bestefich @ 2006-08-10 21:00 UTC (permalink / raw)
  To: Duane Griffin; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 726 bytes --]

Duane Griffin wrote:
> > If it doesn't take into account own changes, then the -n command is
> > unable to produce even a slightly accurate resemblance of what would
> > happen if I did a real run.
>
> It takes into account some of them (such as reading data from the
> backup superblock if it detects corruption). Others will be irrelevant
> for further operations. Many reports will be accurate, especially
> fatal ones. I consider that useful, YMMV.

I've attached the output of a -n run, let's get some facts on the table.

I would be very happy if someone knowledgeable would tell me something
useful about it.

I'm especially worried about the "70058 files, 39754 blocks used (0%)"
message at the end of the e2fsck run.

[-- Attachment #2: fs_check_out.bz2 --]
[-- Type: application/x-bzip2, Size: 39034 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10 19:10                 ` Molle Bestefich
@ 2006-08-11  8:06                   ` Helge Hafting
  0 siblings, 0 replies; 49+ messages in thread
From: Helge Hafting @ 2006-08-11  8:06 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: John Stoffel, linux-kernel

Molle Bestefich wrote:
> John Stoffel wrote:
>> Molle> And voila, that difficult task of assessing in which order to
>> Molle> do things is out of the hands of distros like Red Hat, and into
>> Molle> the hands of those people who actually make the binaries.
>>
>> *bwah hah hah!*
>
> No need to ridicule :-).
> After all, I'm just saying that there's got to be a simpler, stabler
> and more transparent way than to have all this logic sit in shell
> scripts.
Shellscripts are both simple and stable.  Still, you are not the only
one dissatisfied with the sysv init program.

Check out initng:
http://initng.thinktux.net/index.php/Main_Page
It uses config files instead of scripts.  Well, the config files
may contain scripts but they don't have to.  Dependencies
are described in a simple way in these files. Another advantage,
services that don't depend on each other start/stop in parallel,
cutting boot time to 1/3 or so. (30s from powerup to X display is easy.)

I use it on one machine.  It is kind of "unfinished" in that it
doesn't have config files for every service out there, but it works
and you can make your own files if your service isn't supported yet.

When I looked for a init replacement, I found several other
alternatives too.  All trying different approaches, often trying to save
time by starting up things in parallel, but with very different
approaches to dependencies.  Not all were good.  Some made the mistake
of making the dependent software start up all the sw it depends on.
Consider the maintenance nightmare of adding and removing packages
from such a system...

Helge Hafting

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10 16:10               ` John Stoffel
  2006-08-10 19:10                 ` Molle Bestefich
@ 2006-08-11 13:26                 ` Horst H. von Brand
  2006-08-12  8:54                   ` Molle Bestefich
  1 sibling, 1 reply; 49+ messages in thread
From: Horst H. von Brand @ 2006-08-11 13:26 UTC (permalink / raw)
  To: John Stoffel; +Cc: Molle Bestefich, Linux Kernel Mailing List

John Stoffel <john@stoffel.org> wrote:
> >>>>> "Molle" == Molle Bestefich <molle.bestefich@gmail.com> writes:
> 
> Molle> I guess the problem is that I don't know a single Linux
> Molle> packaging system that actually works well enough to keep a
> Molle> system up to date at all times, and I don't have any free time
> Molle> to spend on reinstalling systems all the time.
> 
> Debian works the best in my experience.  Yum sucks rocks, it's a pain
> to configure and when it does pull in stuff, it pulls in *everything*
> that you don't want.  

Right. It's called "dependencies", and if you don't get those, chances are
it won't work. No Debian magic around that, I'm afraid. Sure, if all that
ever changes is minor tweaks...

> Not that apt-get is perfect, but it works well.  My main machine at
> home is an old Debian Stable install upgraded pretty much daily.
> Rarely do things break.  And rarely do you want packages updated
> without you thinking about it.  

My machine here runs Fedora rawhide (can't get more bleeding edge). Rarely
things break due to updating. The machines in the Lab here are Fedora,
almost never anything breaks. Servers are CentOS, I can't remember anything
ever breaking due to updates.

[...]

> I tried gentoo a bunch of years ago and didn't like it, and it
> certainly didn't give me the speedup it claimed it would have.

Gentoo folks deceive themselves into "at least twice as fast because I
compiled it myself"...

> I've been happy with Debian.  Thinking about Ubuntu more...

[...]

> Molle> And voila, that difficult task of assessing in which order to
> Molle> do things is out of the hands of distros like Red Hat, and into
> Molle> the hands of those people who actually make the binaries.
> 
> *bwah hah hah!*  And you think they'll get it right?  So what happens
> when two packages, call them A and B, have a circular dependency on
> each other?  Who wins then?

The kernel people are certainly not infallible either. And there are cases
where the right order is A B C, and others in which it is C B A, and still
others where it doesn't matter. No way to get it right always.

> It's not as simple an issue as you think.  

Shoving it into the kernel certainly won't simplify it, quite the contrary
by making it less amenable to hand-tweaking.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-11 13:26                 ` Horst H. von Brand
@ 2006-08-12  8:54                   ` Molle Bestefich
  2006-08-12 10:31                     ` Molle Bestefich
  2006-08-17  1:27                     ` Horst H. von Brand
  0 siblings, 2 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-12  8:54 UTC (permalink / raw)
  To: Linux Kernel Mailing List

Horst H. von Brand wrote:
> The kernel people are certainly not infallible either. And there are cases
> where the right order is A B C, and others in which it is C B A, and still
> others where it doesn't matter.

In the quite unlikely situation where that happens, you've obviously
got a piece of software which is broken dependency-wise.  Many of the
current schemes will fail to accommodate that too.

For example, no amount of moving the /etc/rc.d/rc6.d/K35smb script
around will fix that situation on Red Hat.

A solution to your example is to fix two of the three broken pieces of
software by splitting B into B1 and B2, and either A or C into their
components likewise:

A1 --> B1 --> C --> B2 --> A2

 -or-

C1 --> B1 --> A --> B2 --> C2

> No way to get it right always.

Your example did in no way prove that, so thus far that statement is not true.
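
To make the splitting idea concrete, here is a small sketch that reads the example as a single setup needing both orders at once (that reading is an assumption).  Unsplit, the constraints "A before B before C" and "C before B before A" contradict each other; split as in the first diagram above, they form one straight chain.  The helper name and the (before, after) pair format are made up for illustration; graphlib is in the Python standard library from 3.9 on.

```python
from graphlib import CycleError, TopologicalSorter


def order(edges):
    """Topologically sort (before, after) pairs; return None on a cycle."""
    sorter = TopologicalSorter()
    for before, after in edges:
        sorter.add(after, before)  # 'after' may only run once 'before' is done
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# Unsplit: A before B before C, but also C before B before A -> contradictory.
unsplit = [("A", "B"), ("B", "C"), ("C", "B"), ("B", "A")]
print(order(unsplit))

# Split as in the first diagram: A1 --> B1 --> C --> B2 --> A2.
split = [("A1", "B1"), ("B1", "C"), ("C", "B2"), ("B2", "A2")]
print(order(split))
```

The unsplit rules cannot be ordered at all, while the split ones sort into exactly the chain drawn above.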

> In any case, this is wildly off-topic for a list on /kernel/ development.
> Better locate a Linux User Group near you, look for mailing lists on running
> Linux, trawl Usenet for a group with acceptable signal/noise ratio.

I did mention that:

> > Anyway, let's all forget about the init scripts forthwith, they're
> > not really relevant for LKML I think.

And:

> > Concentrate on the ext3 issue :-).

And my next posting was about ext3 again.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-12  8:54                   ` Molle Bestefich
@ 2006-08-12 10:31                     ` Molle Bestefich
  2006-08-17  1:27                     ` Horst H. von Brand
  1 sibling, 0 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-12 10:31 UTC (permalink / raw)
  To: Linux Kernel Mailing List

Molle Bestefich wrote:
> A solution to your example is to fix two of the three broken pieces of
> software by splitting B into B1 and B2, and either A or C into their
> components likewise:
>
> A1 --> B1 --> C --> B2 --> A2
>
>  -or-
>
> C1 --> B1 --> A --> B2 --> C2

To clarify, by the above I didn't necessarily mean "split one process
into two processes".  A kernel API where you can wait for a named
object or a subsystem to complete its startup / shutdown would be
just as well.  Or simply waiting on (dis-)appearance of named files in
a dedicated directory named "boot_sequence" in sysfs, would be another
equally fine way to accomplish above scheme.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: ext3 corruption
  2006-08-10 21:00                     ` Molle Bestefich
@ 2006-08-12 16:38                       ` Theodore Tso
  2006-08-12 17:24                         ` Molle Bestefich
  0 siblings, 1 reply; 49+ messages in thread
From: Theodore Tso @ 2006-08-12 16:38 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: Duane Griffin, linux-kernel

On Thu, Aug 10, 2006 at 11:00:07PM +0200, Molle Bestefich wrote:
> Duane Griffin wrote:
> >> If it doesn't take into account own changes, then the -n command is
> >> unable to produce even a slightly accurate resemblance of what would
> >> happen if I did a real run.
> >
> >It takes into account some of them (such as reading data from the
> >backup superblock if it detects corruption). Others will be irrelevant
> >for further operations. Many reports will be accurate, especially
> >fatal ones. I consider that useful, YMMV.
> 
> I've attached the output of a -n run, let's get some facts on the table.
> 
> I would be very happy if someone knowledgeable would tell me something
> useful about it.
> 
> I'm especially worried about the "70058 files, 39754 blocks used (0%)"
> message at the end of the e2fsck run.

OK, so it looks like the primary block group descriptors got trashed,
and so e2fsck had to fall back to using the backup block group
descriptors.  The summary information in the backup block group
descriptors is not backed up, for speed/performance reasons.  This is
not a problem, since that information can always be regenerated
trivially from the pass 5 information.  That's what all of the "free
inodes/blocks/directories count wrong" messages in your log were
about.

The "39754 blocks used (0%)" figure appears just because you were using
-n, and the summary information is calculated from the filesystem
summary data, not from the pass 5 count information (which was thrown
away since you were running -n and thus declined to fix the results).

I can imagine accepting a patch which sets a flag if any discrepancies
found in pass 5 are not fixed, and then if the summary information is
requested, to print a warning message indicating that the summary
information may not be correct.  But no, it's not worth it to take
into account changes that -n might make if the user had said "yes".
The complexities that would entail would be huge, and in fact as it is
e2fsck -n does give a fairly accurate report of what is wrong
with the filesystem.  Is it 100% accurate?  No, but that was never the
goal of e2fsck -n.  If you want that, then use a dm-snapshot, and run
e2fsck on the snapshot....
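
A rough sketch of what such a patch would do, in Python rather than
e2fsck's C.  The function names, the one-byte bitmaps and the warning
wording are all made up for illustration; the only things carried over
from ext2/ext3 are that a set bit in a block bitmap means "in use" and
that each group descriptor stores a free-block count:

```python
def count_free(bitmap, nbits):
    """Count clear bits among the first nbits of a bitmap (set bit = in use)."""
    free = 0
    for i in range(nbits):
        if not (bitmap[i // 8] >> (i % 8)) & 1:
            free += 1
    return free


def check_summary(stored_free, bitmaps, blocks_per_group, fix):
    """Compare the stored per-group free counts against the bitmaps.

    Returns (counts, stale).  With fix=False (the -n case) the stored
    counts are left untouched and stale is True if any of them was wrong.
    """
    counts = list(stored_free)
    stale = False
    for group, bitmap in enumerate(bitmaps):
        actual = count_free(bitmap, blocks_per_group)
        if counts[group] != actual:
            if fix:
                counts[group] = actual
            else:
                stale = True
    return counts, stale


def summary_line(counts, total_blocks, stale):
    """Format the blocks-used summary, flagging it if the counts are stale."""
    used = total_blocks - sum(counts)
    line = "%d/%d blocks used" % (used, total_blocks)
    if stale:
        line += " (warning: pass 5 discrepancies not fixed; may be inaccurate)"
    return line
```

With two 8-block groups whose descriptors wrongly claim everything is
free, the fix=False path reports 0 blocks used plus the warning, while
the fix=True path reports the count derived from the bitmaps.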

						- Ted

* Re: ext3 corruption
  2006-08-12 16:38                       ` Theodore Tso
@ 2006-08-12 17:24                         ` Molle Bestefich
  2006-08-12 21:47                           ` Theodore Tso
  0 siblings, 1 reply; 49+ messages in thread
From: Molle Bestefich @ 2006-08-12 17:24 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-kernel, Duane Griffin

Theodore Tso wrote:
> > I'm especially worried about the "70058 files, 39754 blocks used (0%)"
> > message at the end of the e2fsck run.
>
> OK, so it looks like the primary block group descriptors got trashed,
> and so e2fsck had to fall back to using the backup block group
> descriptors.

Good to have backups.  It would be very useful to know whether e2fsck
contemplates writing those back as primary BGDs when it's done, but I
couldn't find that in the documentation.  Will it?

(Would be good to have the above information in the docs.  Perhaps in
a "what does this message mean?" section.)

(Such a section would also help a lot when confronted with the first
message: "Entry blah is a link to directory bluh. Clear? y/n".
Obviously I don't want to "clear" my data.  But why is e2fsck
confronting me with that question?  Is something wrong with it that it
should be cleared?)

> The summary information in the backup block group
> descriptors is not backed up, for speed/performance reasons.  This is
> not a problem, since that information can always be regenerated
> trivially from the pass 5 information.

Thanks for the information!
(Would be very helpful to have a copy/paste of the above in the docs too...)

> That's what all of "free inodes/blocks/directories count wrong"
> messages in your log were all about.

Ah, I'm relieved.  The sheer number of messages was an indicator to me
that e2fsck is doing something gruesomely wrong.

> The 39754 block used (0%) is just because you were using -n and the
> summary information is calculated from the filesystem summary data,
> not from the pass5 count information (which was thrown away since you
> were running -n and thus declined to fix the results).

Much relieved again, thanks.

I'm wondering why it even tries to use the corrupt information, instead of just:
* reconstructing it from scratch
* not asking the user?

That leaves me a little less relieved once again ;-).

> I can imagine accepting a patch which sets a flag if any discrepancies
> found in pass 5 are not fixed, and then if the summary information is
> requested,

Huh?  The user didn't request anything, it always prints.

> to print a warning message indicating that the summary
> information may not be correct.

Even not printing anything would probably be better than knowingly
printing wrong information...

> But no, it's not worth it to take
> into account changes that -n might make if the user had said "yes".
> The complexities that would entail would be huge, and in fact as it is
> e2fsck -n does give a fairly accurate report of what what is wrong
> with the filesystem.  Is it 100% accurate?  No, but that was never the
> goal of e2fsck -n.  If you want that, then use a dm-snapshot, and run
> e2fsck on the snapshot....

Agreed.  Running a r/w e2fsck on some kind of overlay would be the way
to implement a more useful (for me anyway) version of -n.

But I think dm-snapshot is useless in this case because:
 * It must be configured before MD is configured and the filesystem is
created, which I haven't done on this box.

And generally because:
 * It's rather good at corrupting the filesystems you store on top of
it *itself*...
 * Either you have to create snapshots on devices just as big as the
ones being snapshotted, or you'll have to live with the snapshot
failing any time because it's full.  There's no good management
framework to help you manage the full/failing situations either.

Thanks a lot for the information!

I take it that it's safe to run e2fsck on the filesystem, then...

* Re: ext3 corruption
  2006-08-12 17:24                         ` Molle Bestefich
@ 2006-08-12 21:47                           ` Theodore Tso
  2006-08-13 19:21                             ` Molle Bestefich
  0 siblings, 1 reply; 49+ messages in thread
From: Theodore Tso @ 2006-08-12 21:47 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel, Duane Griffin

On Sat, Aug 12, 2006 at 07:24:06PM +0200, Molle Bestefich wrote:
> 
> Good to have backups.  It would be very useful to know whether e2fsck
> contemplates writing those back as primary BGDs when it's done, but I
> couldn't find that in the documentation.  Will it?

Yes, it will.

> (Would be good to have the above information in the docs.  Perhaps in
> a "what does this message mean?" section.)

Well, if someone would like to volunteer to be a technical writer,
that would be great......

> (Such a section would also help a lot when confronted with the first
> message: "Entry blah is a link to directory bluh. Clear? y/n".
> Obviously I don't want to "clear" my data.  But why is e2fsck
> confronting me with that question?  Is something wrong with it that it
> should be cleared?)

Basically, there are two modes that e2fsck can run in.  What the boot
scripts use is called "preen" mode, which will automatically fix
"safe" things, and stop if there is anything where the system
administrator might need to exercise discretion, or where the
system administrator should know that there might be something that he
or she needs to clean up (like orphaned inodes getting linked into the
lost+found directory, for example).

In the normal mode, e2fsck asks permission before it does anything.
In general, the default answer is "safe", but there are times when a
filesystem expert can do better by declining to fix a problem and then
using debugfs afterwards to try to recover data before running e2fsck
a second time to completely clear out all of the problems.

If you don't like that, you can always run with e2fsck -y, which will
cause e2fsck to never ask permission before going ahead and trying
its best to fix the problem.

> >The summary information in the backup block group
> >descriptors is not backed up, for speed/performance reasons.  This is
> >not a problem, since that information can always be regenerated
> >trivially from the pass 5 information.
> 
> Thanks for the information!
> (Would be very helpful to have a copy/paste of the above in the docs too...)

Well, the e2fsck man page isn't intended to be a tutorial.  If someone
wants to volunteer to write an extended introduction to how e2fsck
works and what all of the messages mean, I'd certainly be willing to
work with that person...  So if you're willing to volunteer or willing
to chip in to pay for a technical writer, let me know....

> I'm wondering why it even tries to use the corrupt information, instead of 
> just:
> * reconstructing it from scratch
> * not asking the user?

It did reconstruct it from scratch; that's what pass 5 is all about.
It just didn't store it in the block group descriptors, because of the
-n option.  

> >I can imagine accepting a patch which sets a flag if any discrepancies
> >found in pass 5 are not fixed, and then if the summary information is
> >requested,
> 
> Huh?  The user didn't request anything, it always prints.

The summary information is only printed when the -v option is given,
and that's about all the -v option does.  The summary information is
not the primary raison d'etre for e2fsck, so I'm not going to waste a
lot of time trying to keep two copies of the information so that the
information can be correct in the -nv case.  That's just soooooo
unimportant, and most users don't use the -v option anyway.

> >with the filesystem.  Is it 100% accurate?  No, but that was never the
> >goal of e2fsck -n.  If you want that, then use a dm-snapshot, and run
> >e2fsck on the snapshot....
> 
> Agreed.  Running a r/w e2fsck on some kind of overlay would be the way
> to implement a more useful (for me anyway) version of -n.
> 
> But I think dm-snapshot is useless in this case because....

Well, I have the following project listed in the TODO file for
e2fsprogs:

   4) Create a new I/O manager (i.e., test_io.c, unix_io.c, et al.) which
   layers on top of an existing I/O manager and provides copy-on-write
   functionality.  This COW I/O manager will take two open I/O
   managers, call them "base" and "changed".  The "base" I/O manager is
   opened read/only, so any changes are written instead to the "changed"
   I/O manager, in a compact, non-sparse format containing the intended
   modification to the "base" filesystem.

   This will allow resize2fs to figure out what changes need to be made to
   extend a filesystem, or expand the size of inodes in the inode table,
   and the changes can be pushed to the filesystem in one fell swoop.  (If
   the system crashes, the program which runs the "changed" file can be
   re-run, much like a journal replay.  My assumption is that the COW
   file will contain the filesystem UUID in the COW superblock, and the
   COW file will be stored in some place such as /var/state/e2fsprogs,
   with an init.d file to automate the replay so we can recover cleanly
   from a crash during the resize2fs process.)

           Difficulty: Medium      Priority: Medium
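
To make the TODO item a bit more concrete, here is a toy model in
Python (e2fsprogs itself is C, and this is not its I/O manager
interface).  The class and method names are invented for illustration,
and the "changed" store is just an in-memory dict rather than the
compact on-disk format the item describes:

```python
class CowBlockStore:
    """Copy-on-write layer: reads fall through to `base`, writes go to `changed`."""

    def __init__(self, base, block_size):
        self.base = base              # bytes-like, treated as read-only
        self.block_size = block_size
        self.changed = {}             # block number -> new block contents

    def read_block(self, n):
        """Return block n, preferring a pending change over the base copy."""
        if n in self.changed:
            return self.changed[n]
        start = n * self.block_size
        return bytes(self.base[start:start + self.block_size])

    def write_block(self, n, data):
        """Record a change to block n without touching the base."""
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        self.changed[n] = bytes(data)

    def apply_to(self, target):
        """Push all pending changes into `target` (a bytearray).

        Re-running this after an interruption gives the same result,
        which is the journal-replay-like property the item asks for.
        """
        for n in sorted(self.changed):
            start = n * self.block_size
            target[start:start + self.block_size] = self.changed[n]
```

Run against such a layer, a read/write e2fsck would see its own fixes
while the "base" device stays untouched until apply_to() is called.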

Patches to implement this would be gratefully accepted....

(This is open source, which means that people who have the bad manners
to kvetch that volunteers who have done all of this free work for them
haven't done $FOO will be gently reminded that patches to implement
$FOO are always welcome.  :-)

						- Ted

* Re: ext3 corruption
  2006-08-12 21:47                           ` Theodore Tso
@ 2006-08-13 19:21                             ` Molle Bestefich
  2006-08-14  3:23                               ` Kyle Moffett
  2006-08-14 15:34                               ` Theodore Tso
  0 siblings, 2 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-13 19:21 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-kernel

Theodore Tso wrote:
> Well, the e2fsck man page isn't intended to be a tutorial.  If someone
> wants to volunteer to write an extended introduction to how e2fsck
> works and what all of the messages mean, I'd certainly be willing to
> work with that person...  So if you're willing to volunteer

I'd love to help others with the same problem that I have.  I know
basically nothing of e2fsck though, and I don't have time to research
and write a whole tutorial.  Maybe there's a wiki somewhere where I
can start something out with a structure and some information
regarding the stuff I've seen?

> or willing to chip in to pay for a technical writer, let me know....

What kind of economic scale were you thinking about?

> (This is open source, which means if people who have the bad manners
> to kvetch that volunteers have done all of this free work for them
> haven't done $FOO will be gently reminded that patches to implement
> $FOO are always welcome.  :-)

OTOH, the open source community vigorously promotes Linux as an
alternative to Windows.

While the above attitude is fine by me, you're going to have to expect
to see some sad faces from Windows users when they create a filesystem
on a loop device and don't realize that the loop driver destroys
journaling expectancies and results in all their photos and home
videos going down the drain, all because nobody implemented a simple
"warning!" message in the software.

(Or whatever.  Lots of similar examples exist to show you that the "no
warranty: you use our software, you learn to hack it to do what you
want yourself or it's your own fault" argument is fallacious.)

* Re: ext3 corruption
  2006-08-13 19:21                             ` Molle Bestefich
@ 2006-08-14  3:23                               ` Kyle Moffett
  2006-08-14 15:34                               ` Theodore Tso
  1 sibling, 0 replies; 49+ messages in thread
From: Kyle Moffett @ 2006-08-14  3:23 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: Theodore Tso, linux-kernel

On Aug 13, 2006, at 15:21:24, Molle Bestefich wrote:
> Theodore Tso wrote:
>> (This is open source, which means if people who have the bad  
>> manners to kvetch that volunteers have done all of this free work  
>> for them haven't done $FOO will be gently reminded that patches to  
>> implement $FOO are always welcome.  :-)
>
> OTOH, the open source community rigorously PR Linux as an  
> alternative to Windows.

Some people do; some people believe it's still not ready (for the  
desktop environment where Windows currently has majority  
marketshare).  I run a fileserver for my parents and wouldn't use  
anything other than Linux/OpenLDAP/Samba/device-mapper/mdadm on fully  
open-spec hardware, but I wouldn't expect them to do anything other  
than call me when it breaks and maybe follow a few specific  
instructions for getting it network-accessible again via server- 
management chip.  This is all really easy for _me_ to manage with  
Linux on good server hardware, but that's not something I'd think a  
non-admin could handle on their own.  And for 3D graphics, GUI  
programs, etc, IMHO Linux is still miles from being where it needs to  
be to really compete.

> While the above attitude is fine by me, you're going to have to  
> expect to see some sad faces from Windows users when they create a  
> filesystem on a loop device and don't realize that the loop driver  
> destroys journaling expectancies and results in all their photos  
> and home videos going down the drain, all because nobody  
> implemented a simple "warning!" message in the software.

This is really what distros are expected to do (at least in the  
current environment).  The major development groups don't have the  
financial and legal backing to be able to certify reliability and  
support for *any* user, let alone your average Joe User who's used to  
Windows and *clicky*-*clicky*-ing his way around the UI.  Eventually  
there will be enough vendors selling Linux-based systems that the UI- 
polish patches will be developed as rapidly as the fundamental  
underlying infrastructure, but we're not there yet.  Ubuntu and such  
are paving the way for future even-better-than-mac vertical UI  
integration but we have a lot of UI infrastructure (especially 3d  
support in X) that needs fixing first.  IMHO Linux is still very much  
for hobbyists, server administrators, and other people who have at  
least a modicum of computer problem-solving skills.

> (Or whatever.  Lots of similar examples exist to show you that the  
> "no warranty: you use our software, you learn to hack it to do what  
> you want yourself or it's your own fault" argument is fallacious.)

That kind of warranty is a hobbyist-type warranty.  Some companies  
invest money to build upon that and provide server-admin-type or end- 
user-type warranties, but such support costs money and time which  
most upstream developers don't have.

Cheers,
Kyle Moffett

* Re: ext3 corruption
  2006-08-13 19:21                             ` Molle Bestefich
  2006-08-14  3:23                               ` Kyle Moffett
@ 2006-08-14 15:34                               ` Theodore Tso
  2006-08-14 17:21                                 ` Molle Bestefich
  1 sibling, 1 reply; 49+ messages in thread
From: Theodore Tso @ 2006-08-14 15:34 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

On Sun, Aug 13, 2006 at 09:21:24PM +0200, Molle Bestefich wrote:
> I'd love to help others with the same problem that I have.  I know
> basically nothing of e2fsck though, and I don't have time to research
> and write a whole tutorial.  Maybe there's a wiki somewhere where I
> can start something out with a structure and some information
> regarding the stuff I've seen?

There isn't yet, but kernel.org is supposed to be setting up a wiki soon.
So hopefully we can get a wiki started there.

> >or willing to chip in to pay for a technical writer, let me know....
> 
> What kind of economic scale where you thinking about?

Well, the Technical Advisory Board of the OSDL (James Bottomley is the
chair, other folks include Greg K-H, Randy Dunlap, and others
including myself) is trying to fund a technical writer, mostly for
kernel documentation, as well as other kernel projects.  OSDL is
mainly set up to solicit monies from companies, but it might be
possible to get something set up so we can accept donations from
individuals.

> >(This is open source, which means if people who have the bad manners
> >to kvetch that volunteers have done all of this free work for them
> >haven't done $FOO will be gently reminded that patches to implement
> >$FOO are always welcome.  :-)
> 
> OTOH, the open source community rigorously PR Linux as an alternative
> to Windows.

Some people do, but not all.  It also depends on what usage model you
are looking at.  If it's kiosks or fixed-function Windows facilities
(i.e., used by travel agents, receptionists, cash registers), then
Linux would certainly be ready now, and it's probably easier to use
Linux than Windows for those scenarios.  But for the "knowledge
worker" who is a power Windows user who regularly exchanges Microsoft
Office files with others and who needs 100% Office compatibility?  Not
hardly!

> While the above attitude is fine by me, you're going to have to expect
> to see some sad faces from Windows users when they create a filesystem
> on a loop device and don't realize that the loop driver destroys
> journaling expectancies and results in all their photos and home
> videos going down the drain, all because nobody implemented a simple
> "warning!" message in the software.

To be fair, there are plenty of other dangerous things that you can do
with Windows that don't have warning messages pop up.  And using the
loop driver is of a complexity which is higher than what you would
expect of a typical Windows user.  You might as well complain that
Linux doesn't give a warning message when you run some command like
"rm -rf /", or "dd if=/dev/zero of=/dev/hda".  I'm sure there are
similar commands (probably involving regedit :-) that are just as
dangerous from the Windows cmd.exe window.....

							- Ted

* Re: ext3 corruption
  2006-08-14 15:34                               ` Theodore Tso
@ 2006-08-14 17:21                                 ` Molle Bestefich
  0 siblings, 0 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-14 17:21 UTC (permalink / raw)
  To: Theodore Tso, linux-kernel

Theodore Tso wrote:
> To be fair, there are plenty of other dangerous things that you can do
> with Windows that don't have warning messages pop-up.  And using the
> loop driver is of a complexity which is higher than what you would
> expect of a typical Windows user.  You might as well complain that
> Linux doesn't give a warning message when you run some command like
> "rm -rf /", or "dd if=/dev/null of=/dev/hda".  I'm sure there are
> similar commands (probably involving regedit :-) that are just as
> dangerous from the Windows cmd.exe window.....

Hardly comparable..

"rm", "dd if=/dev/zero" and "format c:" are meant to nuke your hard drive.
The loop driver just does it as a nasty side effect of a stinky
implementation.

* Re: ext3 corruption
  2006-08-12  8:54                   ` Molle Bestefich
  2006-08-12 10:31                     ` Molle Bestefich
@ 2006-08-17  1:27                     ` Horst H. von Brand
  2006-08-17 13:46                       ` Molle Bestefich
  1 sibling, 1 reply; 49+ messages in thread
From: Horst H. von Brand @ 2006-08-17  1:27 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: Linux Kernel Mailing List

Molle Bestefich <molle.bestefich@gmail.com> wrote:
> Horst H. von Brand wrote:

[...]

> > The kernel people are certainly not infallible either. And there are cases
> > where the right order is A B C, and others in which it is C B A, and still
> > others where it doesn't matter.

> In the quite unlikely situation where that happens, you've obviously
> got a piece of software which is broken dependency-wise.  Many of the
> current schemes will fail to accommodate that too.

It isn't broken /software/, it is /different setups/.

> For example, no amount of moving the /etc/rc.d/rc6.d/K35smb script
> around will fix that situation on Red Hat.

What situation?
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

* Re: ext3 corruption
  2006-08-17  1:27                     ` Horst H. von Brand
@ 2006-08-17 13:46                       ` Molle Bestefich
  0 siblings, 0 replies; 49+ messages in thread
From: Molle Bestefich @ 2006-08-17 13:46 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: Linux Kernel Mailing List

Horst H. von Brand wrote:
> > > The kernel people are certainly not infallible either. And there are cases
> > > where the right order is A B C, and others in which it is C B A, and still
> > > others where it doesn't matter.
>
> > In the quite unlikely situation where that happens, you've obviously
> > got a piece of software which is broken dependency-wise.  Many of the
> > current schemes will fail to accommodate that too.
>
> It isn't broken /software/, it is /different setups/.

It's broken software.

> > For example, no amount of moving the /etc/rc.d/rc6.d/K35smb script
> > around will fix that situation on Red Hat.
>
> What situation?

The situation you outlined, where A can depend on B, which can depend
on C, but in another usage scenario C can depend on B which can depend
on A.
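
For what it's worth, the ordering question can be checked mechanically.
Below is a small Python sketch (it needs Python 3.9+ for graphlib);
the service names are the placeholder letters from this thread, and
start_order() and merge() are hypothetical helpers, not part of any
init system:

```python
from graphlib import CycleError, TopologicalSorter


def start_order(depends_on):
    """Return a start order for {service: set of services it needs first},
    or None if the dependencies are circular."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return None


def merge(*setups):
    """Combine several dependency tables into one fixed table."""
    merged = {}
    for setup in setups:
        for service, needs in setup.items():
            merged.setdefault(service, set()).update(needs)
    return merged


# Two usage scenarios with opposite dependencies.
setup_one = {"A": {"B"}, "B": {"C"}, "C": set()}
setup_two = {"C": {"B"}, "B": {"A"}, "A": set()}

# The split proposed earlier: A1 --> B1 --> C --> B2 --> A2.
split = {"A1": set(), "B1": {"A1"}, "C": {"B1"}, "B2": {"C"}, "A2": {"B2"}}
```

Each setup orders fine by itself; it is only merge(setup_one, setup_two),
i.e. one fixed table meant to serve both, that has no valid order.  The
split table orders without trouble.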

* Re: ext3 corruption
  2006-08-10 12:19               ` Helge Hafting
  2006-08-10 13:00                 ` Molle Bestefich
@ 2006-09-24  8:56                 ` Molle Bestefich
  2006-09-25 12:27                   ` Helge Hafting
  1 sibling, 1 reply; 49+ messages in thread
From: Molle Bestefich @ 2006-09-24  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: Helge Hafting

I wrote:
> I have a ~1TB filesystem that failed to mount today, the message is:
>
> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
> group 2338 not in group (block 1607003381)!
> EXT3-fs: group descriptors corrupted !
>
> Yesterday it worked flawlessly.

Helge Hafting wrote:
> > And voila, that difficult task of assessing in which order to do
> > things is out of the hands of distros like Red Hat, and into the
> > hands of those people who actually make the binaries.
>
> Not so easy.  You do not want to shut down md devices because
> samba is using them. Someone else may run samba on a single
> harddisk and also have some md-devices that they take down
> and bring up a lot.  So having samba generally depend on md doesn't
> work.  Your setup need it, others may have different needs.

I've looked hard at things and just found that maybe it's not the init
order that's to blame..

It seems that unmounting the filesystem fails with a "device busy" error.
I'm not sure why there are still open files on the device, but perhaps a
remote user is copying a file or some such (likely).

Anyway, the system is shutting down, so it should just forcefully
unmount the device, but it doesn't.
The halt script tries "umount" three times, which all fail with:
"device is busy".
It then actually tries "umount -f" three times, which all fail with
"Device or resource busy"
At which point the halt script turns off the machine and the
filesystem is ruined.

How to fix forceful unmount so it works?

* Re: ext3 corruption
  2006-09-24  8:56                 ` Molle Bestefich
@ 2006-09-25 12:27                   ` Helge Hafting
  2006-10-02  2:40                     ` Molle Bestefich
  0 siblings, 1 reply; 49+ messages in thread
From: Helge Hafting @ 2006-09-25 12:27 UTC (permalink / raw)
  To: Molle Bestefich; +Cc: linux-kernel

Molle Bestefich wrote:
> I wrote:
>> I have a ~1TB filesystem that failed to mount today, the message is:
>>
>> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
>> group 2338 not in group (block 1607003381)!
>> EXT3-fs: group descriptors corrupted !
>>
>> Yesterday it worked flawlessly.
>
> Helge Hafting wrote:
>> > And voila, that difficult task of assessing in which order to do
>> > things is out of the hands of distros like Red Hat, and into the
>> > hands of those people who actually make the binaries.
>>
>> Not so easy.  You do not want to shut down md devices because
>> samba is using them. Someone else may run samba on a single
>> harddisk and also have some md-devices that they take down
>> and bring up a lot.  So having samba generally depend on md doesn't
>> work.  Your setup need it, others may have different needs.
>
> I've looked hard at things and just found that maybe it's not the init
> order that's to blame..
>
> It seems that unmounting the filesystem fails with a "device busy" error.
> I'm not sure why there's still open files on the device, but perhaps a
> remote user is copying a file or some such (likely).

That is solvable by shutting down remote operations first.
So stop samba (or nfs or whatever) before attempting to umount.

> Anyway, the system is shutting down, so it should just forcefully
> unmount the device, but it doesn't.
> The halt script tries "umount" three times, which all fail with:
> "device is busy".
> It then actually tries "umount -f" three times, which all fail with
> "Device or resource busy"
> At which point the halt script turns off the machine and the
> filesystem is ruined.
>
> How to fix forceful unmount so it works?

I don't know, other than researching which filesystems support
forced umount and using one of those. Complain to the vendor or maintainer
of your particular filesystem.

However, you can usually find out why some file is open. Try
umount yourself, when it doesn't work, use "lsof" to see
what file is open. Then figure out who or what is keeping it open.
To debug a shutdown problem, consider putting "lsof >> logfile"
in your shutdown script.
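
If lsof isn't available that late in shutdown, roughly the same
information can be pulled from /proc.  A minimal Python sketch,
assuming a Linux-style /proc; the function name and the proc_root
parameter are made up for illustration, and it only looks at open file
descriptors, not at working directories or mapped files, which can
also keep a filesystem busy:

```python
import os


def open_files_under(mount_point, proc_root="/proc"):
    """Return sorted (pid, path) pairs for open files at or below mount_point.

    Walks proc_root/<pid>/fd and reads each symlink.  Processes that
    cannot be inspected (exited, or no permission) are skipped.
    """
    mount = os.path.abspath(mount_point)
    prefix = os.path.join(mount, "")      # trailing slash: "/mnt/data/"
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue                      # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target == mount or target.startswith(prefix):
                hits.append((int(entry), target))
    return sorted(hits)
```

Logging the result for the filesystem's mount point right before the
umount attempts should show which process still holds files open.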

Not a solution but a workaround: Run "sync" before shutdown.
(Stick it in some script.)
Now all data in filesystem caches will be written to disk before power
is lost.
This isn't perfect, but filesystem damage is greatly reduced and
often avoided completely.  Useful while waiting for a better solution.

The real solution is to set things up so unforced umount works.
This is normally possible to do.


Helge Hafting

* Re: ext3 corruption
  2006-09-25 12:27                   ` Helge Hafting
@ 2006-10-02  2:40                     ` Molle Bestefich
  2006-10-02  3:24                       ` Gene Heskett
  0 siblings, 1 reply; 49+ messages in thread
From: Molle Bestefich @ 2006-10-02  2:40 UTC (permalink / raw)
  To: linux-kernel

Helge Hafting wrote:
> [snip]

Well, that was unproductive :-).

If anyone knows how to make forced unmounting work, hints would be
greatly appreciated.

To reiterate:
The distro halt script tries "umount -f" three times, which all fail with
"Device or resource busy".

* Re: ext3 corruption
  2006-10-02  2:40                     ` Molle Bestefich
@ 2006-10-02  3:24                       ` Gene Heskett
  2006-10-02  6:50                         ` Kyle Moffett
  0 siblings, 1 reply; 49+ messages in thread
From: Gene Heskett @ 2006-10-02  3:24 UTC (permalink / raw)
  To: linux-kernel

On Sunday 01 October 2006 22:40, Molle Bestefich wrote:
>Helge Hafting wrote:
>> [snip]
>
>Well, that was unproductive :-).
>
>If anyone knows how to make forced unmounting work, hints would be
>greatly appreciated.
>
>To reiterate:
>The distro halt script tries "umount -f" three times, which all fail with
>"Device or resource busy".

Me too.
I'm getting those messages from the NFS stuff at shutdown time, with NO NFS 
shares active.  I have had them for years.  But the reboot goes on 
eventually, and apparently without harm.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2006 by Maurice Eugene Heskett, all rights reserved.

* Re: ext3 corruption
  2006-10-02  3:24                       ` Gene Heskett
@ 2006-10-02  6:50                         ` Kyle Moffett
  0 siblings, 0 replies; 49+ messages in thread
From: Kyle Moffett @ 2006-10-02  6:50 UTC (permalink / raw)
  To: Gene Heskett; +Cc: linux-kernel

On Oct 01, 2006, at 23:24:48, Gene Heskett wrote:
> On Sunday 01 October 2006 22:40, Molle Bestefich wrote:
>> To reiterate:
>> The distro halt script tries "umount -f" three times, which all  
>> fail with
>> "Device or resource busy".
>
> Me too.
> I'm getting those messages from the NFS stuff at shutdown time,  
> with NO NFS
> shares active.  I have had them for years.  But the reboot goes on
> eventually, and apparently without harm.

What causes problems on _all_ of my softraid boxes is that without a  
whole bunch of pivot_root magic in the shutdown code to switch to a  
tmpfs and unmount my lvm-on-md-on-sata stuff, it's impossible to get  
the kernel to stop devices cleanly.  I get all sorts of messages from  
the kernel about trying to stop MD devices and not being able to  
_after_ reboot is called, even though at that point it should just  
forcibly kill all userspace, unmount all filesystems, and deconstruct  
the MD/DM device tree.  I see no reason why a successful shutdown or  
reboot call should _ever_ leave the disks in an inconsistent state.

Cheers,
Kyle Moffett

end of thread

Thread overview: 49+ messages
2006-07-13 20:32 ext3 corruption Molle Bestefich
2006-08-08 23:47 ` Molle Bestefich
2006-08-09  1:33   ` Sergio Monteiro Basto
2006-08-09 10:36   ` Molle Bestefich
2006-08-09 11:33   ` linux-os (Dick Johnson)
2006-08-09 15:22     ` Molle Bestefich
2006-08-09 15:38       ` Michael Loftis
2006-08-09 18:28         ` Molle Bestefich
2006-08-09 18:41           ` Mws
2006-08-09 20:17           ` Duane Griffin
2006-08-09 20:47             ` Molle Bestefich
     [not found]               ` <e9e943910608091527t3b88da7eo837f6adc1e1e6f98@mail.gmail.com>
2006-08-09 23:09                 ` Molle Bestefich
2006-08-10  0:08                   ` Duane Griffin
2006-08-10 21:00                     ` Molle Bestefich
2006-08-12 16:38                       ` Theodore Tso
2006-08-12 17:24                         ` Molle Bestefich
2006-08-12 21:47                           ` Theodore Tso
2006-08-13 19:21                             ` Molle Bestefich
2006-08-14  3:23                               ` Kyle Moffett
2006-08-14 15:34                               ` Theodore Tso
2006-08-14 17:21                                 ` Molle Bestefich
2006-08-10  3:06           ` Jim Crilly
2006-08-10  9:48             ` Molle Bestefich
2006-08-10 11:41               ` linux-os (Dick Johnson)
2006-08-10 12:21                 ` Molle Bestefich
2006-08-10 12:19               ` Helge Hafting
2006-08-10 13:00                 ` Molle Bestefich
2006-08-10 14:40                   ` gmu 2k6
2006-09-24  8:56                 ` Molle Bestefich
2006-09-25 12:27                   ` Helge Hafting
2006-10-02  2:40                     ` Molle Bestefich
2006-10-02  3:24                       ` Gene Heskett
2006-10-02  6:50                         ` Kyle Moffett
2006-08-10 16:10               ` John Stoffel
2006-08-10 19:10                 ` Molle Bestefich
2006-08-11  8:06                   ` Helge Hafting
2006-08-11 13:26                 ` Horst H. von Brand
2006-08-12  8:54                   ` Molle Bestefich
2006-08-12 10:31                     ` Molle Bestefich
2006-08-17  1:27                     ` Horst H. von Brand
2006-08-17 13:46                       ` Molle Bestefich
2006-08-10  8:32           ` Bernd Petrovitsch
2006-08-10  7:44       ` Denis Vlasenko
     [not found] <Pine.LNX.4.33.0207121337500.8654-100000@coffee.psychology.mcmaster.ca>
2002-07-12 19:05 ` Alec Smith
2002-07-12 20:11   ` Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2002-07-12 15:32 Alec Smith
2002-07-12 15:52 ` Russell King
2002-07-12 19:51   ` Andrew Morton
2002-07-12 16:02 ` Alan Cox
