From: "Jim Schutt" <jaschut@sandia.gov>
To: Sage Weil <sage@inktank.com>
Cc: Wido den Hollander <wido@42on.com>,
Greg Farnum <greg@inktank.com>,
ceph-devel@vger.kernel.org
Subject: Re: CephFS First product release discussion
Date: Wed, 6 Mar 2013 12:07:41 -0700
Message-ID: <513793FD.7010001@sandia.gov>
In-Reply-To: <alpine.DEB.2.00.1303051131010.29462@cobra.newdream.net>
On 03/05/2013 12:33 PM, Sage Weil wrote:
>> > Running 'du' on each directory would be much faster with Ceph since it
>> > tracks the subdirectories and shows their total size with an 'ls
>> > -al'.
>> >
>> > Environments with 100k users also tend to be very dynamic with adding and
>> > removing users all the time, so creating separate filesystems for them would
>> > be very time consuming.
>> >
>> > Now, I'm not talking about enforcing soft or hard quotas, I'm just talking
>> > about knowing how much space uid X and Y consume on the filesystem.
> The part I'm most unclear on is what use cases people have where uid X and
> Y are spread around the file system (not in a single or small set of sub
> directories) and per-user (not, say, per-project) quotas are still
> necessary. In most environments, users get their own home directory and
> everything lives there...

Hmmm, is there a tool I should be using that will return the space
used by a directory, and all its descendants?

If it's 'du', that tool is definitely not fast for me.

I'm doing an 'strace du -s <path>', where <path> has one
subdirectory which contains ~600 files. I've got ~200 clients
mounting the file system, and each client wrote 3 files in that
directory.

I'm doing the 'du' from one of those nodes, and the strace is showing
me du is doing a 'newfstat' for each file. For each file that was
written on a different client from where du is running, that 'newfstat'
takes tens of seconds to return. Which means my 'du' has been running
for quite some time and hasn't finished yet....

I'm hoping there's another tool I'm supposed to be using that I
don't know about yet. Our use case includes tens of millions
of files written from thousands of clients, and whatever tool
we use to do space accounting needs to not walk an entire directory
tree, checking each file.
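
To make the contrast concrete, here is a minimal sketch of the two approaches. It is not part of the original message. `walk_usage` mimics the per-file stat pattern seen in the strace output (it sums apparent sizes, whereas du itself reports disk blocks). `recursive_bytes` assumes the directory's recursive size is exposed as an extended attribute; the attribute name `ceph.dir.rbytes` is an assumption not stated in this message, and both function names are hypothetical.

```python
import os


def walk_usage(path):
    """Walk the whole tree and lstat every file, as a du-like tool does.

    Returns (total apparent bytes, number of stat calls issued).
    The cost grows with the number of files under path.
    """
    total = 0
    stat_calls = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            total += st.st_size
            stat_calls += 1
    return total, stat_calls


def recursive_bytes(path, attr="ceph.dir.rbytes"):
    """Read a directory's recursive byte count from one extended attribute.

    The attribute name is an assumption. Returns None when the
    attribute or the xattr call is unavailable (for example, on a
    file system that does not provide it).
    """
    try:
        raw = os.getxattr(path, attr)
    except (AttributeError, OSError):
        return None
    try:
        return int(raw.decode().strip("\x00 \n"))
    except ValueError:
        return None
```

If the attribute is available, a single call on the top-level directory replaces one stat per file; otherwise `recursive_bytes` returns None and the caller has to fall back to the walk.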
-- Jim
>
> sage
>
>
>> >
>> > Wido