From: "J. Bruce Fields" <bfields@fieldses.org>
To: Sage Weil <sage@newdream.net>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
ceph-devel@lists.sourceforge.net
Subject: Re: Recursive directory accounting for size, ctime, etc.
Date: Tue, 15 Jul 2008 15:53:33 -0400
Message-ID: <20080715195333.GK21590@fieldses.org>
In-Reply-To: <Pine.LNX.4.64.0806192313530.7379@cobra.newdream.net>

On Tue, Jul 15, 2008 at 11:28:22AM -0700, Sage Weil wrote:
> Fields prefixed with 'r' are recursively defined, while
> entries/files/subdirs is just for the one directory. 'rctime' is the most
> recent ctime within the hierarchy, which should be useful for backup
> software or anything else scanning the hierarchy for recent changes.
>
> Naturally, there are a few caveats:
>
> - There is some built-in delay before statistics fully propagate up
> toward the root of the hierarchy. Changes are propagated
> opportunistically when lock/lease state allows, with an upper bound of (by
> default) ~30 seconds for each level of directory nesting.

That makes it less useful, e.g., for somebody with cached data trying to
validate their cache, or for something like git trying to check a
directory tree for changes.

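
To put a rough number on that: a minimal sketch, assuming the ~30 second
bound applies at each level and the per-level delays simply add up. The
helper name and the constant are made up for illustration and are not
part of Ceph.

```python
from pathlib import PurePosixPath

# Assumed default upper bound per level of nesting, taken from the
# "~30 seconds" figure quoted above. Illustrative only.
PER_LEVEL_DELAY_S = 30


def worst_case_delay(changed_dir: str, root: str = "/") -> int:
    """Upper bound (seconds) before a change inside `changed_dir`
    is reflected in the recursive stats of `root`.

    Hypothetical helper: one propagation step is assumed per
    directory level between `changed_dir` and `root`.
    """
    levels = len(PurePosixPath(changed_dir).relative_to(root).parts)
    return levels * PER_LEVEL_DELAY_S


if __name__ == "__main__":
    # A change three levels below the root: 3 * 30 seconds.
    print(worst_case_delay("/a/b/c"))
```

Under those assumptions, a cache validator checking rctime at the top of
a deep tree could miss a change for minutes, not seconds.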
> - Ceph internally distinguishes between multiple links to the same file
> (there is a single 'primary' link, and then zero or more 'remote' links).
> Only the primary link contributes toward the 'rbytes' total.

Is that only true for 'rbytes'?

--b.

>
> - The 'rbytes' summation is over i_size, not blocks used. That means
> sparse files "appear" larger than the storage space they actually consume.
>
> - Directories don't yet contribute anything to the 'rbytes' total. They
> should probably include an estimate of the storage consumed by directory
> metadata. For this reason, and because the size isn't rounded up to the
> block size, the 'rbytes' total will usually be slightly smaller than what
> you get from 'du'.
>
> - Currently no stats for the root directory itself.
>
>
> I'm extremely interested in what people think of overloading the file
> system interface in this way. Handy? Crufty? Dangerous? Does anybody
> know of any applications that rely on or expect meaningful values for a
> directory's i_size? Or read() a directory?
>
>
> More information on the recursive accounting at
>
> http://ceph.newdream.net/wiki/Recursive_accounting
>
> and Ceph itself at
>
> http://ceph.newdream.net/
>
> Cheers-
> sage