public inbox for linux-audit@redhat.com
From: Rudi Chiarito <nutello@sweetness.com>
To: linux-audit@redhat.com
Subject: Filesystem access statistics
Date: Wed, 12 Apr 2006 17:23:25 +0200
Message-ID: <20060412152325.GA1399@plain.rackshack.net>

Hi,
I subscribed to the list after checking with Steve that this would not
be an outlandish place to ask my questions.

I need to look at a portion of the filesystem namespace and maintain
aggregate statistics on access patterns. In other words, I have a large
filesystem and would like to find out which are the hot spots. I don't
need to keep track of every single file access: since the file count is
on the order of millions, that would swamp the actual I/O, the
analysis, and the people looking at the final data. It would make sense
to just group accesses by looking at the top N levels (anything
accessed at levels N+1, N+2, etc. would be coalesced into the parent
directory at level N).
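
To make the coalescing concrete, here is a minimal sketch in Python. The
function name coalesce, the /data root and the depth of 2 are purely
illustrative assumptions and do not come from any existing tool. A path
that lies deeper than N levels below the root is mapped to its ancestor
at level N; a path outside the root is rejected.

```python
from pathlib import PurePosixPath
from typing import Optional


def coalesce(path: str, root: str, depth: int) -> Optional[str]:
    """Map `path` to its ancestor at most `depth` levels below `root`.

    Returns None when `path` is not under `root`. A path that is already
    at or above the cut-off level is returned unchanged.
    """
    p = PurePosixPath(path)
    r = PurePosixPath(root)
    try:
        # relative_to() compares whole components, so /database does
        # not count as being under /data.
        rel = p.relative_to(r)
    except ValueError:
        return None
    # Keep only the first `depth` components below the root.
    return str(r.joinpath(*rel.parts[:depth]))


if __name__ == "__main__":
    print(coalesce("/data/a/b/c/file.txt", "/data", 2))
    print(coalesce("/etc/passwd", "/data", 2))
```

Working on path components rather than raw string prefixes avoids
mistaking a sibling such as /database for part of the /data tree.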

I doubt I am the only one with such a need. In my case, the
information is going to be used to change the way the tree is laid out
in the future, as well as to determine when parts of it can
be made read-only (after an inactivity period). I can also see the
information being useful for selective incremental backups - just look
at the hot spots - or for smarter ordering during a disaster recovery
restore (if you're recovering from random access storage, not tape).
Maybe even locate/slocate/rlocate/mlocate could take advantage of it.

What would be the best approach to this? Inotify doesn't seem to cut it,
because it can't handle recursive watches, and I can't afford to place a
watch on every directory. Given the sheer number of operations being
tracked, it looks like I'd need some custom code that audits all
file/directory operations, determines if there's a match (I'm only
interested in a specific tree, not everything under /), increments
internal counters and throws the event away. Is there code I could look
at for ideas?
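
The match-count-discard loop described above could look roughly like the
following Python sketch. It is only an illustration under stated
assumptions: the events are hypothetical (operation, path) tuples, the
function name aggregate is made up, and how such events would actually
be obtained from the kernel or the audit subsystem is not shown.

```python
from collections import Counter
from typing import Iterable, Tuple


def aggregate(events: Iterable[Tuple[str, str]], root: str, depth: int) -> Counter:
    """Count events per (bucket, operation) for paths under `root`.

    `events` is a hypothetical stream of (operation, absolute path)
    tuples. Each event is examined once and then dropped; only the
    counters are kept in memory.
    """
    counts: Counter = Counter()
    prefix = root.rstrip("/") + "/"
    for op, path in events:
        # Ignore anything outside the tree of interest.
        if not path.startswith(prefix):
            continue
        # Fold deeper paths into their ancestor at level `depth`.
        parts = path[len(prefix):].split("/")
        bucket = prefix + "/".join(parts[:depth])
        counts[(bucket, op)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ("open", "/data/a/b/c"),
        ("read", "/data/a/b/d/e"),
        ("open", "/etc/passwd"),
        ("open", "/data/x"),
    ]
    for key, n in sorted(aggregate(sample, "/data", 2).items()):
        print(key, n)
```

Memory use is bounded by the number of distinct buckets and operation
types, not by the number of files or events.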

Thanks in advance for any help.

-- 
Rudi

Thread overview: 4+ messages
2006-04-12 15:23 Rudi Chiarito [this message]
2006-04-12 16:26 ` Filesystem access statistics Steve Grubb
2006-04-12 20:12   ` Rudi Chiarito
2006-04-13 13:50     ` Steve Grubb
