public inbox for linux-kernel@vger.kernel.org
* O_TRUNC problem on a full filesystem
@ 2001-05-23  6:13 Manas Garg
  2001-05-23  9:16 ` OT: " Helge Hafting
  2001-05-23  9:55 ` Andrew Morton
  0 siblings, 2 replies; 10+ messages in thread
From: Manas Garg @ 2001-05-23  6:13 UTC (permalink / raw)
  To: linux-kernel

I am not sure if it should be classified as a bug, that's why I am calling it a
problem. Here is the description:

If the filesystem is full, obviously, I can't write anything to that any
longer. But if I open a file with O_TRUNC flag set, the file will be truncated.
Any program that opens a file with O_TRUNC flag set, wishes to write something
there later on. But because the filesystem is full, it can't write. It would
definitely happen if the file is not huge (TESTED). But I am not sure what
happens if the file _is_ huge (NOT TESTED).

I lost configuration files of a few programs this way. While exiting, they
opened their conf files with O_TRUNC flag but couldn't write anything there.

The kernel in use is 2.4.4.
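
To illustrate the failure mode described above (a minimal sketch, not code
from any of the affected programs): open() with O_TRUNC discards the old
contents at open time, before a single byte has been written, so a later
write() that fails with ENOSPC leaves an empty file behind. The helper name
size_after_trunc_open is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open `path` with O_TRUNC, write nothing, and
 * return the resulting file size (or -1 on error).  The old contents
 * are already gone when open() returns, whether or not any later
 * write() succeeds. */
long size_after_trunc_open(const char *path)
{
    struct stat st;
    int fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_TRUNC);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return (long)st.st_size;
}
```

Calling this on an existing, non-empty file returns 0: the data is lost even
though the program never got as far as its first write().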

	--manas

* OT: O_TRUNC problem on a full filesystem
  2001-05-23  6:13 O_TRUNC problem on a full filesystem Manas Garg
@ 2001-05-23  9:16 ` Helge Hafting
  2001-05-23  9:55 ` Andrew Morton
  1 sibling, 0 replies; 10+ messages in thread
From: Helge Hafting @ 2001-05-23  9:16 UTC (permalink / raw)
  To: Manas Garg; +Cc: linux-kernel

Manas Garg wrote:
> 
> I am not sure if it should be classified as a bug, that's why I am calling it a
> problem. Here is the description:
Not a bug.

> If the filesystem is full, obviously, I can't write anything to that any
> longer. But if I open a file with O_TRUNC flag set, the file will be truncated.
> Any program that opens a file with O_TRUNC flag set, wishes to write something
> there later on. But because the filesystem is full, it can't write. It would
> definitely happen if the file is not huge (TESTED). But I am not sure what
> happens if the file _is_ huge (NOT TESTED).

Truncating the file frees up the space it took, and will allow writing.
But someone else may grab the space before you; there is no guarantee
on a multi-process system.

Note that the last few % of the disk are reserved for root.  So it will be
"full" for users even if there are a few % left.  Root may have filled the
disk well into the reserved part (logfiles etc.).  A user deleting a small
file will free up some space, but the fs may still be overfull, i.e. have
less than those few % in free space.  This is probably what happened to you.

> I lost configuration files of a few programs this way. While exiting, they
> opened their conf files with O_TRUNC flag but couldn't write anything there.

Ill-written programs - complain to the maintainers.
They should write the new config file under a different name first,
then rename it onto the old name.  This fails gracefully
on a full fs: you get to keep the old file.

Or have a fixed-size config file and update the
contents in place.
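
A minimal sketch of the write-then-rename approach, assuming a POSIX system
where rename() atomically replaces the target. The function name
save_config_atomically and the ".tmp" suffix are hypothetical choices, and
the sketch does not preserve the old file's ownership or mode.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with `len` bytes from `data`.
 * The new contents go to "<path>.tmp" first; the old file is replaced
 * only after every byte has been written and flushed.
 * Returns 0 on success, -1 on failure (old file left untouched). */
int save_config_atomically(const char *path, const char *data, size_t len)
{
    char tmp[4096];
    size_t done = 0;
    int fd, n;

    assert(path != NULL && data != NULL);
    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (done < len) {
        ssize_t w = write(fd, data + done, len - done);
        if (w < 0 && errno == EINTR)
            continue;            /* interrupted, try again */
        if (w <= 0)
            goto fail;           /* e.g. ENOSPC on a full filesystem */
        done += (size_t)w;
    }
    if (fsync(fd) < 0)           /* make sure the data really hit the disk */
        goto fail;
    if (close(fd) < 0) {
        unlink(tmp);
        return -1;
    }
    if (rename(tmp, path) < 0) { /* old file stays in place on failure */
        unlink(tmp);
        return -1;
    }
    return 0;

fail:
    close(fd);
    unlink(tmp);
    return -1;
}
```

On a full filesystem the write to the temporary file fails, the temporary
file is removed, and the existing config file is never touched.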

* Re: O_TRUNC problem on a full filesystem
  2001-05-23  6:13 O_TRUNC problem on a full filesystem Manas Garg
  2001-05-23  9:16 ` OT: " Helge Hafting
@ 2001-05-23  9:55 ` Andrew Morton
  2001-05-24 11:16   ` Stephen C. Tweedie
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2001-05-23  9:55 UTC (permalink / raw)
  To: Manas Garg; +Cc: linux-kernel

Manas Garg wrote:
> 
> I am not sure if it should be classified as a bug, that's why I am calling it a
> problem. Here is the description:
> 

It works fine with ext3 :)

That's because ext3 has per-file block preallocation
disabled.

When you truncated your file, the blocks remained preallocated
on behalf of the file, and were hence considered "used".  For
some reason, a subsequent attempt to allocate blocks for the
same file failed to use that file's preallocated blocks.

It's an arguable bug in ext2 and, as you've seen, the consequences
are bad.  Your applications _are_ a little bit buggy,
because they can't assume that just because they
truncated the file, that space will remain available to
them.

Maybe someone would like to wade through screenfuls of icky
single-char identifiers and fix it?

* Re: O_TRUNC problem on a full filesystem
  2001-05-23  9:55 ` Andrew Morton
@ 2001-05-24 11:16   ` Stephen C. Tweedie
  2001-05-24 11:28     ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen C. Tweedie @ 2001-05-24 11:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Manas Garg, linux-kernel, Stephen Tweedie

On Wed, May 23, 2001 at 07:55:48PM +1000, Andrew Morton wrote:

> When you truncated your file, the blocks remained preallocated
> on behalf of the file, and were hence considered "used".  For
> some reason, a subsequent attempt to allocate blocks for the
> same file failed to use that file's preallocated blocks.

Nope.  ext2_truncate() calls ext2_discard_prealloc() to fix this up.
Both 2.2 and 2.4 do this correctly.

Cheers,
 Stephen

* Re: O_TRUNC problem on a full filesystem
  2001-05-24 11:16   ` Stephen C. Tweedie
@ 2001-05-24 11:28     ` Andrew Morton
  2001-05-24 17:24       ` Andreas Dilger
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2001-05-24 11:28 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Manas Garg, linux-kernel

"Stephen C. Tweedie" wrote:
> 
> On Wed, May 23, 2001 at 07:55:48PM +1000, Andrew Morton wrote:
> 
> > When you truncated your file, the blocks remained preallocated
> > on behalf of the file, and were hence considered "used".  For
> > some reason, a subsequent attempt to allocate blocks for the
> > same file failed to use that file's preallocated blocks.
> 
> Nope.  ext2_truncate() calls ext2_discard_prealloc() to fix this up.
> Both 2.2 and 2.4 do this correctly.

But the problem goes away when you disable EXT2_PREALLOCATE.
I tested it.

* Re: O_TRUNC problem on a full filesystem
  2001-05-24 11:28     ` Andrew Morton
@ 2001-05-24 17:24       ` Andreas Dilger
  2001-05-24 18:15         ` Stephen C. Tweedie
  2001-05-25  0:24         ` Andrew Morton
  0 siblings, 2 replies; 10+ messages in thread
From: Andreas Dilger @ 2001-05-24 17:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Stephen C. Tweedie, Manas Garg, linux-kernel

Andrew writes:
> "Stephen C. Tweedie" wrote:
> > On Wed, May 23, 2001 at 07:55:48PM +1000, Andrew Morton wrote:
> > > When you truncated your file, the blocks remained preallocated
> > > on behalf of the file, and were hence considered "used".  For
> > > some reason, a subsequent attempt to allocate blocks for the
> > > same file failed to use that file's preallocated blocks.
> > 
> > Nope.  ext2_truncate() calls ext2_discard_prealloc() to fix this up.
> > Both 2.2 and 2.4 do this correctly.
> 
> But the problem goes away when you disable EXT2_PREALLOCATE.
> I tested it.

Are you sure that a truncated file will re-use the same truncated blocks,
but not the preallocated ones?  I can imagine not re-using all of the data
blocks within a single transaction, but it would be odd if the preallocated
blocks are treated differently.

How have you done the ext3 preallocation code?  One way to do it would be
to only mark the blocks as used in the in-memory copy of the block bitmap
and not write that to disk (we keep 2 copies of the block bitmap, IIRC).
That way you don't need to do anything fancy at recovery time.

Did you ever benchmark ext2 with and without preallocation to see if it
made any difference?  No point in doing extra work if there is no benefit.

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert

* Re: O_TRUNC problem on a full filesystem
  2001-05-24 17:24       ` Andreas Dilger
@ 2001-05-24 18:15         ` Stephen C. Tweedie
  2001-05-24 20:26           ` Andreas Dilger
  2001-05-25  0:24         ` Andrew Morton
  1 sibling, 1 reply; 10+ messages in thread
From: Stephen C. Tweedie @ 2001-05-24 18:15 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Andrew Morton, Stephen C. Tweedie, Manas Garg, linux-kernel

Hi,

On Thu, May 24, 2001 at 11:24:10AM -0600, Andreas Dilger wrote:

> How have you done the ext3 preallocation code? 

Preallocation is currently disabled in ext3.  Eventually I'll probably
get it going by adding a journal prepare-commit callback to allow the
filesystem to flush preallocation before committing.

> One way to do it would be
> to only mark the blocks as used in the in-memory copy of the block bitmap
> and not write that to disk (we keep 2 copies of the block bitmap, IIRC).

Indeed; I'd need to keep 3 copies to make that work.  The state
machine just gets even uglier.  :-)  I thought about it and I might
still end up going that route.

> Did you ever benchmark ext2 with and without preallocation to see if it
> made any difference?  No point in doing extra work if there is no benefit.

The point is not just performance, but also cpu cost (which
preallocation definitely wins on) and on fragmentation if you have
multiple writers in the same directory.

Cheers,
 Stephen 

* Re: O_TRUNC problem on a full filesystem
  2001-05-24 18:15         ` Stephen C. Tweedie
@ 2001-05-24 20:26           ` Andreas Dilger
  0 siblings, 0 replies; 10+ messages in thread
From: Andreas Dilger @ 2001-05-24 20:26 UTC (permalink / raw)
  To: Stephen C. Tweedie
  Cc: Andreas Dilger, Andrew Morton, Manas Garg, linux-kernel

Stephen writes:
> On Thu, May 24, 2001 at 11:24:10AM -0600, Andreas Dilger wrote:
> > How have you done the ext3 preallocation code? 
> 
> Preallocation is currently disabled in ext3.  Eventually I'll probably
> get it going by adding a journal prepare-commit callback to allow the
> filesystem to flush preallocation before committing.

Yes, it is disabled in 2.2 ext3, but if Andrew is complaining about the
blocks not being freed in 2.4 ext3, I assume he re-enabled it somehow...

> > One way to do it would be
> > to only mark the blocks as used in the in-memory copy of the block bitmap
> > and not write that to disk (we keep 2 copies of the block bitmap, IIRC).
> 
> Indeed; I'd need to keep 3 copies to make that work.  The state
> machine just gets even uglier.  :-)  I thought about it and I might
> still end up going that route.
> 
> > Did you ever benchmark ext2 with and without preallocation to see if it
> > made any difference?  No point in doing extra work if there is no benefit.
> 
> The point is not just performance, but also cpu cost (which
> preallocation definitely wins on)

Yes, currently this is one thing that ext2/ext3 still have over XFS and
reiserfs - lower CPU usage.  If you ever see benchmarks on slower CPU
systems, ext2 does very well, and XFS does quite poorly.  The only bad
spot is the large directory handling, and I think Daniel's indexed dir
code handles that very well, because it doesn't need continual balancing
and re-packing of the directory entries.

I even realized that if you have a (formerly) huge directory with lots
of empty blocks, this even speeds up searches for non-existent entries,
sort of like a negative dentry.

> and on fragmentation if you have multiple writers in the same directory.

Yes, of course this is also hard to notice in benchmarks, but a good
feature in real life to keep file reads more within device read-ahead.

When I was thinking about the current preallocation code in conjunction
with the block goal searching code, I realized that we are normally
looking through the bitmap for at least 8 contiguous blocks, and then
back-searching for up to 7 additional contiguous blocks.  We _should_ do
block preallocation immediately at this point for up to 14 more blocks,
because we already know they are contiguous.  Something like the following
(untested) patch:

Cheers, Andreas
=========================================================================
diff -u -u -r1.4 balloc.c
--- fs/ext3/balloc.c	2001/05/21 17:00:17	1.4
+++ fs/ext3/balloc.c	2001/05/24 20:17:02
@@ -509,7 +509,11 @@
 	int bitmap_nr;
 	struct super_block * sb;
 	struct ext3_group_desc * gdp;
-	struct ext3_super_block * es;
+	struct ext3_super_block *es = EXT3_SB(sb)->s_es;
+#ifdef EXT3_PREALLOCATE
+	int prealloc_goal = es->s_prealloc_blocks ?
+		es->s_prealloc_blocks : EXT3_DEFAULT_PREALLOC_BLOCKS;
+#endif
 #ifdef EXT3FS_DEBUG
 	static int goal_hits = 0, goal_attempts = 0;
 #endif
@@ -521,7 +526,6 @@
 	}
 
 	lock_super (sb);
-	es = sb->u.ext3_sb.s_es;
 	if (le32_to_cpu(es->s_free_blocks_count) <=
 			le32_to_cpu(es->s_r_blocks_count) &&
 	    ((sb->u.ext3_sb.s_resuid != current->fsuid) &&
@@ -614,7 +618,9 @@
 		k < 7 && j > 0 && ext3_test_allocatable(j - 1, bh);
 		k++, j--)
 		;
-	
+#ifdef EXT3_PREALLOCATE
+	prealloc_goal += k;
+#endif
 got_block:
 
 	ext3_debug ("using block group %d(%d)\n", i, gdp->bg_free_blocks_count);
@@ -673,11 +679,7 @@
 	 */
 	/* Writer: ->i_prealloc* */
 	if (prealloc_count && !*prealloc_count) {
-		int	prealloc_goal;
 		unsigned long next_block = tmp + 1;
-
-		prealloc_goal = es->s_prealloc_blocks ?
-			es->s_prealloc_blocks : EXT3_DEFAULT_PREALLOC_BLOCKS;
 
 		*prealloc_block = next_block;
 		/* Writer: end */
 
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert

* Re: O_TRUNC problem on a full filesystem
  2001-05-24 17:24       ` Andreas Dilger
  2001-05-24 18:15         ` Stephen C. Tweedie
@ 2001-05-25  0:24         ` Andrew Morton
  2001-05-25  9:42           ` Stephen C. Tweedie
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2001-05-25  0:24 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Stephen C. Tweedie, Manas Garg, linux-kernel

Andreas Dilger wrote:
> 
> Andrew writes:
> > "Stephen C. Tweedie" wrote:
> > > On Wed, May 23, 2001 at 07:55:48PM +1000, Andrew Morton wrote:
> > > > When you truncated your file, the blocks remained preallocated
> > > > on behalf of the file, and were hence considered "used".  For
> > > > some reason, a subsequent attempt to allocate blocks for the
> > > > same file failed to use that file's preallocated blocks.
> > >
> > > Nope.  ext2_truncate() calls ext2_discard_prealloc() to fix this up.
> > > Both 2.2 and 2.4 do this correctly.
> >
> > But the problem goes away when you disable EXT2_PREALLOCATE.
> > I tested it.
> 
> Are you sure that a truncated file will re-use the same truncated blocks,
> but not the preallocated ones?  I can imagine not re-using all of the data
> blocks within a single transaction, but it would be odd if the preallocated
> blocks are treated differently.

This is vanilla ext2.  The O_TRUNC problem is easy to reproduce,
and goes away when EXT*2*_PREALLOCATE is undefined.  Haven't looked
into it further, but I suppose one should.  It's not nice having
unexplained mysteries in ext2.

> How have you done the ext3 preallocation code?  One way to do it would be
> to only mark the blocks as used in the in-memory copy of the block bitmap
> and not write that to disk (we keep 2 copies of the block bitmap, IIRC).
> That way you don't need to do anything fancy at recovery time.
> 
> Did you ever benchmark ext2 with and without preallocation to see if it
> made any difference?  No point in doing extra work if there is no benefit.

This is an excellent point - it would be unwise to go to the
effort and complexity of putting prealloc back into ext3
without first analysing how useful it actually is.  Perhaps
some tuning of the other anti-fragmentation algorithms
will suffice.

For example, when we miss the goal block we search forward
up to 63 blocks for a *single* free block, and use that.
Perhaps we shouldn't?

And perhaps the search for eight contiguous free blocks
is no longer appropriate to current disks.  32 may be better?

So I'd prefer to set up a simulator and at least validate the
current algorithms beforehand, perhaps tune them as well.
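
As a starting point for such a simulator, here is a toy user-space model of
the two searches mentioned above, plus the back-search of up to 7 blocks
discussed earlier in the thread. This is a sketch under stated assumptions,
not the kernel's code: the bitmap layout, the function names and the exact
search bounds are all hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

/* Bit `blk` set in `map` means the block is in use. */
static int block_in_use(const unsigned char *map, size_t blk)
{
    return (map[blk / 8] >> (blk % 8)) & 1;
}

/* Search 1: starting at `goal`, look at most 63 blocks further on for
 * a *single* free block.  Returns the block number, or -1. */
long find_near_goal(const unsigned char *map, size_t nblocks, size_t goal)
{
    size_t end = goal + 64 < nblocks ? goal + 64 : nblocks;

    assert(map != NULL);
    for (size_t b = goal; b < end; b++)
        if (!block_in_use(map, b))
            return (long)b;
    return -1;
}

/* Search 2: look for an all-zero byte, i.e. eight contiguous free
 * blocks, then walk backwards over up to 7 more free blocks so the
 * run starts as early as possible.  Returns the first block of the
 * run, or -1 if no free byte exists. */
long find_free_run(const unsigned char *map, size_t nblocks, size_t start)
{
    assert(map != NULL);
    for (size_t i = start / 8; i < nblocks / 8; i++) {
        if (map[i] != 0)
            continue;
        size_t b = i * 8;
        for (int k = 0; k < 7 && b > 0 && !block_in_use(map, b - 1); k++)
            b--;
        return (long)b;
    }
    return -1;
}
```

Changing the constants 64, 8 and 7 in this model is one cheap way to compare
alternative search widths before touching the real allocator.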

* Re: O_TRUNC problem on a full filesystem
  2001-05-25  0:24         ` Andrew Morton
@ 2001-05-25  9:42           ` Stephen C. Tweedie
  0 siblings, 0 replies; 10+ messages in thread
From: Stephen C. Tweedie @ 2001-05-25  9:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andreas Dilger, Stephen C. Tweedie, Manas Garg, linux-kernel

Hi,

On Fri, May 25, 2001 at 10:24:49AM +1000, Andrew Morton wrote:

> For example, when we miss the goal block we search forward
> up to 63 blocks for a *single* free block, and use that.
> Perhaps we shouldn't?

The reasoning here is that it's much cheaper to go to a single block
which is very nearby than to be forced to use that single block later
on as part of some distant file once the disk becomes fuller.  It's a
sort of opportunistic fragmentation: if we can sneak in a disk
allocation that uses the awkward block without requiring a seek (and
in all likelihood coming out of the track buffer), then we reduce the
overall impact on performance of that isolated free block.

> And perhaps the search for eight contiguous free blocks
> is no longer appropriate to current disks.  32 may be better?

I've thought about that but today we're usually allocating in 4k
chunks rather than 1k so it's normally a 32k preallocation which we
get, anyway.

Cheers,
 Stephen
