public inbox for linux-xfs@vger.kernel.org
From: Matthias Schniedermeyer <ms@citd.de>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: Files appear too big in `du`
Date: Tue, 10 May 2011 17:33:00 +0200
Message-ID: <20110510153300.GA5764@citd.de>
In-Reply-To: <20110510131705.GE19446@dastard>

On 10.05.2011 23:17, Dave Chinner wrote:
> On Tue, May 10, 2011 at 12:57:00PM +0200, Matthias Schniedermeyer wrote:
> > Hi
> > 
> > 
> > For a few weeks I've been experiencing an annoying 'thing' where files are 
> > often too big in `du` and directory totals are too high in `ls -l`.
> > 
> > It appears that files which are in the process of being 
> > copied/downloaded/whatever grow in large chunks ahead of time, while 
> > the actual file content is being copied into the files.
> 
> It's supposed to work like this. It's called speculative allocation
> beyond end of file. XFS has always done this, but we've recently
> made it more aggressive to prevent excessive fragmentation on
> concurrent large file workloads when there is lots of disk space
> free.

OK.

> > And then it 
> > appears that the last chunk isn't shrunk after the process is finished.
> 
> It should be truncated away when the file descriptor is closed and
> the last reference goes away.
>
> > Neither xfs_bmap (Version 3.1.5) nor filefrag shows anything beyond the 
> > extent that comprises the actual file content.
> 
> What is the output of xfs_bmap -vvp on a file that apparently hasn't
> been shrunk? How do you know it hasn't been shrunk? Does it persist

du

> forever in this state, or does doing something like dropping caches
> (echo 3 > /proc/sys/vm/drop_caches) cause the speculative
> preallocation to disappear?

This works:
sync ; echo 3 > /proc/sys/vm/drop_caches

At least in several tries, the `du` output shrank to the size of the 
original.
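
For reference, the comparison I am making per file can be sketched roughly 
like this. It is only an illustration: the function name is made up, and it 
assumes st_blocks is counted in 512-byte units, as it is on Linux.

```python
import os

def allocation_report(path):
    """Return (apparent_bytes, allocated_bytes, excess_bytes) for path.

    Hypothetical helper, not part of any XFS tooling.
    """
    st = os.stat(path)
    apparent = st.st_size            # size shown by `ls -l` for the file itself
    allocated = st.st_blocks * 512   # space counted by plain `du` (assumes 512-byte units)
    # A positive excess means more space is allocated than the file content
    # needs, e.g. because of preallocation beyond end of file.
    return apparent, allocated, allocated - apparent
```

Calling it before and after the sync/drop_caches step should show the 
excess going away if the extra space really was speculative preallocation.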

> > Any idea how to debug this, or is this a known bug that waiting a few 
> > days for 2.6.39 should fix?
> 
> It doesn't appear to be doing anything wrong from your description.
> Remember that XFS is optimised for high end storage and server
> configurations and workloads, not typical desktop usage...

I would call it a regression.
I regularly follow copying/downloading with `du`, and the speculative
preallocation makes that more or less useless. This is especially true when 
downloading something big from the internet, which @ 231kb/s isn't exactly 
fast and shows identical `du`s for increasingly long periods of time.
(Or "--apparent-size" should be made the default, but that falls short with 
sparse files.)
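
To illustrate the sparse-file caveat, here is a minimal sketch. The helper 
names are invented for this example, and it assumes st_blocks is in 512-byte 
units (as on Linux) and a filesystem that supports holes.

```python
import os

def make_sparse_file(path, size):
    """Create a file of `size` bytes without writing any data (hypothetical helper)."""
    with open(path, "wb") as f:
        f.truncate(size)  # extends the file; on most filesystems this leaves a hole

def sizes(path):
    """Return (apparent_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # st_size is the apparent size; st_blocks * 512 is the allocated space
    # (512-byte units are an assumption that holds on Linux).
    return st.st_size, st.st_blocks * 512
```

For such a file the apparent size reports the full length while almost 
nothing is allocated, so the apparent size overstates disk usage in the 
opposite direction.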

IMHO `du`/`ls -l` should not be able to 'see' the speculative 
preallocation.




Until then

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


