From: "John Stoffel" <john@stoffel.org>
To: Pavel Machek <pavel@suse.cz>
Cc: Sage Weil <sage@newdream.net>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
ceph-devel@lists.sourceforge.net
Subject: Re: Recursive directory accounting for size, ctime, etc.
Date: Fri, 8 Aug 2008 09:11:00 -0400
Message-ID: <18588.17892.1819.479800@stoffel.org>
In-Reply-To: <20080805182610.GD8380@ucw.cz>
>>>>> "Pavel" == Pavel Machek <pavel@suse.cz> writes:
Pavel> On Tue 2008-07-15 11:28:22, Sage Weil wrote:
>> All-
>>
>> Ceph is a new distributed file system for Linux designed for scalability
>> (terabytes to exabytes, tens to thousands of storage nodes), reliability,
>> and performance. The latest release (v0.3), aside from xattr support and
>> the usual slew of bugfixes, includes a unique (?) recursive accounting
>> infrastructure that allows statistics about all metadata nested beneath a
>> point in the directory hierarchy to be efficiently propagated up the tree.
>> Currently this includes a file and directory count, total bytes (summation
>> over file sizes), and most recent inode ctime. For example, for a
>> directory like /home, Ceph can efficiently report the total number of
>> files, directories, and bytes contained by that entire subtree of the
>> directory hierarchy.
>>
>> The file size summation is the most interesting, as it effectively gives
>> you directory-based quota space accounting with fine granularity. In many
>> deployments, the quota _accounting_ is more important than actual
>> enforcement. Anybody who has had to figure out what has filled/is filling
>> up a large volume will appreciate how cumbersome and inefficient 'du' can
>> be for that purpose--especially when you're in a hurry.
>>
>> There are currently two ways to access the recursive stats via a standard
>> shell. The first simply sets the directory st_size value to the
>> _recursive_ bytes ('rbytes') value (when the client is mounted with -o
>> rbytes). For example (watch the directory sizes),
Pavel> ...
>> Naturally, there are a few caveats:
>>
>> - There is some built-in delay before statistics fully propagate up
>> toward the root of the hierarchy. Changes are propagated
>> opportunistically when lock/lease state allows, with an upper bound of (by
>> default) ~30 seconds for each level of directory nesting.
Pavel> Having instant rctime would be very nice -- for stuff like locate and
Pavel> speeding up kde startup.
>> I'm extremely interested in what people think of overloading the file
>> system interface in this way. Handy? Crufty? Dangerous? Does anybody
Pavel> Too ugly to live.
Pavel> What about new rstat() syscall?
Or how about tying this into the quotactl() syscall and extending it a
bit? Say quotactl2(cmd, device, id, addr, path), which is probably just
as ugly, but seems to make better sense.
Me, I'd love to have this type of reporting on my filesystems,
especially since it would help me in my day job.

How this would look when exported over NFS is an open question too.
John