From: Andreas Dilger <adilger@sun.com>
To: Sage Weil <sage@newdream.net>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	ceph-devel@lists.sourceforge.net
Subject: Re: Recursive directory accounting for size, ctime, etc.
Date: Tue, 15 Jul 2008 13:47:06 -0600
Message-ID: <20080715194706.GK6239@webber.adilger.int>
In-Reply-To: <Pine.LNX.4.64.0806192313530.7379@cobra.newdream.net>

On Jul 15, 2008  11:28 -0700, Sage Weil wrote:
> unique (?) recursive accounting 
> infrastructure that allows statistics about all metadata nested beneath a 
> point in the directory hierarchy to be efficiently propagated up the tree.  
> Currently this includes a file and directory count, total bytes (summation 
> over file sizes), and most recent inode ctime.

Interesting...

> Note that st_blocks is _not_ recursively defined, so 'du' still behaves as 
> expected.  If mounted with -o norbytes instead, the directory st_size is 
> the number of entries in the directory.

Is it possible to extract an environment variable from the process
in the kernel to decide what behaviour to have (e.g. like LS_COLORS)?

> The second interface takes advantage of the fact (?) that read() on a 
> directory is more or less undefined.  (Okay, that's not really true, but 
> it used to return encoded dirents or something similar, and more recently 
> returns -EISDIR.  As far as I know, no sane application expects meaningful 
> data from read() on a directory...)  So, assuming Ceph is mounted with -o 
> dirstat,

Hmm, what about just creating a virtual xattr that can be had with
getfattr user.dirstats?
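
As a rough userspace sketch of what reading such an attribute might look
like: the name user.dirstats is only the suggestion above, not an existing
Ceph interface, and the value format is unspecified, so the hypothetical
helper below just returns the raw text (or None if the attribute is absent
or xattrs are unsupported).

```python
import os


def read_dirstats(path, name="user.dirstats"):
    """Return the named xattr on `path` as text, or None if unavailable.

    ASSUMPTION: "user.dirstats" is a hypothetical attribute name taken from
    the suggestion in this mail; no filesystem is known to export it.
    """
    getxattr = getattr(os, "getxattr", None)  # os.getxattr is Linux-only
    if getxattr is None:
        return None
    try:
        raw = getxattr(path, name)
    except OSError:
        # Missing path, missing attribute, or xattrs not supported here.
        return None
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(read_dirstats("."))
```

An xattr keeps read() on directories untouched and works with existing
tools, at the cost of one extra syscall per directory.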

>  - The 'rbytes' summation is over i_size, not blocks used.  That means 
> sparse files "appear" larger than the storage space they actually consume.

I'd think that in many cases it is more important to accumulate the
block count and not the size, since a single sparse core file would
throw off the whole "hunt for the worst space consumer" approach.
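
To make the difference concrete, here is a minimal sketch that walks a tree
in userspace and sums both quantities: apparent size (st_size, what rbytes
accumulates) and allocated space (st_blocks, which POSIX leaves
implementation-defined but Linux reports in 512-byte units). The function
name tree_usage is made up for this example, and hard links are counted
once per name.

```python
import os
import tempfile


def tree_usage(root):
    """Return (apparent_bytes, allocated_bytes) for regular files under root.

    apparent_bytes  sums st_size (sparse files count at full length).
    allocated_bytes sums st_blocks * 512 (space actually consumed,
    assuming the Linux convention of 512-byte units).
    """
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            st = os.lstat(os.path.join(dirpath, fname))
            if not os.path.stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, devices, sockets, ...
            apparent += st.st_size
            allocated += st.st_blocks * 512
    return apparent, allocated


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Create a file with a large hole, similar to a sparse core dump.
        with open(os.path.join(d, "core"), "wb") as f:
            f.truncate(1 << 30)
        print(tree_usage(d))
```

On a filesystem that supports holes, the two totals for the demo file
should differ widely, which is the case a size-only summation misreports.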

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

