linux-fsdevel.vger.kernel.org archive mirror
* How to implement "filesystem operations tracker"?
@ 2013-09-20 16:39 Anatol Pomozov
  2013-09-20 16:44 ` Kirill A. Shutemov
  2013-09-20 18:43 ` Vyacheslav Dubeyko
  0 siblings, 2 replies; 7+ messages in thread
From: Anatol Pomozov @ 2013-09-20 16:39 UTC
  To: linux-fsdevel

Hi,

I have the following requirement: I start a process that performs a lot
of filesystem operations, and I need to know which files the process
used, broken down into read operations and write operations.

A real-world example of where this is needed is a build system: I run
"gcc foo.c" and I want to know which files are dependencies of this
operation. I want to record that information and, if any of the
dependencies is modified, rerun "gcc".
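
The "record, then rerun on change" step can be sketched as below. This is
a minimal sketch, not taken from any existing build system: it assumes the
dependency list has already been collected by some tracker, and the helper
names snapshot() and needs_rebuild() are hypothetical. It compares
modification time and size, which is only one possible change test.

```python
import os


def snapshot(paths):
    """Record (mtime_ns, size) per dependency; None if the file is missing."""
    state = {}
    for path in paths:
        try:
            st = os.stat(path)
            state[path] = (st.st_mtime_ns, st.st_size)
        except FileNotFoundError:
            state[path] = None
    return state


def needs_rebuild(recorded):
    """Return True if any recorded dependency differs from its current state."""
    return snapshot(recorded.keys()) != recorded
```

A build tool would call snapshot() on the tracked files right after running
"gcc foo.c", store the result, and rerun the command only when
needs_rebuild() returns True.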

There are build systems that track dependencies by mounting a
pass-through FUSE filesystem and chroot()ing into it, e.g. tup
https://github.com/gittup/tup  But FUSE is relatively slow and
introduces an additional buffer copy. I do not want to copy data to
user space and back; all I need is to record which files were
stat()ed/open()ed.

Is there a lightweight mechanism that allows this?

If not, I am thinking of a simple filesystem that intercepts stat/open
and passes all other operations through to the underlying filesystem
implementation. The stat/open information is saved, and userspace can
later retrieve it, e.g. via a device file. Does this idea sound
sane?


* Re: How to implement "filesystem operations tracker"?
  2013-09-20 16:39 How to implement "filesystem operations tracker"? Anatol Pomozov
@ 2013-09-20 16:44 ` Kirill A. Shutemov
  2013-09-20 18:00   ` Anatol Pomozov
  2013-09-20 18:43 ` Vyacheslav Dubeyko
  1 sibling, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2013-09-20 16:44 UTC
  To: Anatol Pomozov; +Cc: linux-fsdevel

On Fri, Sep 20, 2013 at 09:39:48AM -0700, Anatol Pomozov wrote:
> Hi,
> 
> I have the following requirement: I start a process that performs a lot
> of filesystem operations, and I need to know which files the process
> used, broken down into read operations and write operations.
>
> A real-world example of where this is needed is a build system: I run
> "gcc foo.c" and I want to know which files are dependencies of this
> operation. I want to record that information and, if any of the
> dependencies is modified, rerun "gcc".
>
> There are build systems that track dependencies by mounting a
> pass-through FUSE filesystem and chroot()ing into it, e.g. tup
> https://github.com/gittup/tup  But FUSE is relatively slow and
> introduces an additional buffer copy. I do not want to copy data to
> user space and back; all I need is to record which files were
> stat()ed/open()ed.
>
> Is there a lightweight mechanism that allows this?

What's wrong with strace? In altlinux it was used for ages to find
build dependencies.

See:

git://git.altlinux.org/people/ldv/packages/rpm-utils.git

in particular, buildreq and strace_files.
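
A rough sketch of the strace idea, for illustration only: it is not the
code from buildreq or strace_files. It assumes a trace captured with
something like "strace -f -e trace=open,openat,stat,lstat -o trace.log
gcc foo.c" and parses the usual text format of those lines, which can
differ between strace versions and architectures. The classify() helper
and its regular expressions are hypothetical.

```python
import re

# Matches e.g.  openat(AT_FDCWD, "/usr/include/stdio.h", O_RDONLY|O_NOCTTY) = 4
# Calls relative to a directory fd other than AT_FDCWD are not handled here.
CALL_RE = re.compile(
    r'\b(open|openat|stat|lstat)\('
    r'(?:AT_FDCWD, )?'
    r'"((?:[^"\\]|\\.)*)"'      # quoted path, allowing escaped characters
    r'(?:, ([A-Z_|]+))?'        # open flags, absent for stat/lstat
)
FAILED_RE = re.compile(r'\)\s*= -1\b')


def classify(lines):
    """Split successfully accessed paths into (reads, writes)."""
    reads, writes = set(), set()
    for line in lines:
        m = CALL_RE.search(line)
        if m is None or FAILED_RE.search(line):
            continue  # unrelated syscall, or the call returned an error
        _, path, flags = m.groups()
        flags = flags or ""
        if "O_WRONLY" in flags or "O_RDWR" in flags:
            writes.add(path)
        else:
            reads.add(path)
    return reads, writes
```

Failed lookups are skipped here for simplicity; a real dependency tracker
may want to keep them, since a header that did not exist during one build
can appear before the next.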

-- 
 Kirill A. Shutemov


* Re: How to implement "filesystem operations tracker"?
  2013-09-20 16:44 ` Kirill A. Shutemov
@ 2013-09-20 18:00   ` Anatol Pomozov
  2013-09-21  0:38     ` Tetsuo Handa
  0 siblings, 1 reply; 7+ messages in thread
From: Anatol Pomozov @ 2013-09-20 18:00 UTC
  To: Kirill A. Shutemov; +Cc: linux-fsdevel

Hi

On Fri, Sep 20, 2013 at 9:44 AM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> What's wrong with strace? In altlinux it was used for ages to find
> build dependencies.
>
> See:
>
> git://git.altlinux.org/people/ldv/packages/rpm-utils.git
>
> in particular, buildreq and strace_files.

tup tried both the LD_PRELOAD and ptrace approaches for tracking file
usage, and each has its own kind of issues. At this point it uses a
FUSE pass-through filesystem, which seems to be the best solution apart
from performance.

As for ptrace, I believe it has high run-time overhead, and it gets
tricky to use in a multi-threaded environment. Mike Shal from
https://groups.google.com/forum/#!forum/tup-users is a better source of
information about it, though.


* Re: How to implement "filesystem operations tracker"?
  2013-09-20 16:39 How to implement "filesystem operations tracker"? Anatol Pomozov
  2013-09-20 16:44 ` Kirill A. Shutemov
@ 2013-09-20 18:43 ` Vyacheslav Dubeyko
  2013-09-20 18:48   ` Anatol Pomozov
  1 sibling, 1 reply; 7+ messages in thread
From: Vyacheslav Dubeyko @ 2013-09-20 18:43 UTC
  To: Anatol Pomozov; +Cc: linux-fsdevel


On Sep 20, 2013, at 8:39 PM, Anatol Pomozov wrote:

> Hi,
> 
> I have the following requirement: I start a process that performs a lot
> of filesystem operations, and I need to know which files the process
> used, broken down into read operations and write operations.
>
> A real-world example of where this is needed is a build system: I run
> "gcc foo.c" and I want to know which files are dependencies of this
> operation. I want to record that information and, if any of the
> dependencies is modified, rerun "gcc".
>
> There are build systems that track dependencies by mounting a
> pass-through FUSE filesystem and chroot()ing into it, e.g. tup
> https://github.com/gittup/tup  But FUSE is relatively slow and
> introduces an additional buffer copy. I do not want to copy data to
> user space and back; all I need is to record which files were
> stat()ed/open()ed.
>
> Is there a lightweight mechanism that allows this?
> 

What about the stackable file system approach (FiST)? As far as I can see,
this approach is used by UnionFS and eCryptfs. I suppose that such an
approach could be applicable to your task.

With the best regards,
Vyacheslav Dubeyko.

> If not, I am thinking of a simple filesystem that intercepts stat/open
> and passes all other operations through to the underlying filesystem
> implementation. The stat/open information is saved, and userspace can
> later retrieve it, e.g. via a device file. Does this idea sound
> sane?



* Re: How to implement "filesystem operations tracker"?
  2013-09-20 18:43 ` Vyacheslav Dubeyko
@ 2013-09-20 18:48   ` Anatol Pomozov
  2013-09-20 19:15     ` Vyacheslav Dubeyko
  0 siblings, 1 reply; 7+ messages in thread
From: Anatol Pomozov @ 2013-09-20 18:48 UTC
  To: Vyacheslav Dubeyko; +Cc: linux-fsdevel

Hi

On Fri, Sep 20, 2013 at 11:43 AM, Vyacheslav Dubeyko <slava@dubeyko.com> wrote:
> What about the stackable file system approach (FiST)? As far as I can see,
> this approach is used by UnionFS and eCryptfs. I suppose that such an
> approach could be applicable to your task.

Yeah, unionfs looks related. I need something similar but simpler.

I just found wrapfs (http://wrapfs.filesystems.org/), which seems even
closer to what I am looking for. Is wrapfs the best example of a "simple
stackable fs"?


* Re: How to implement "filesystem operations tracker"?
  2013-09-20 18:48   ` Anatol Pomozov
@ 2013-09-20 19:15     ` Vyacheslav Dubeyko
  0 siblings, 0 replies; 7+ messages in thread
From: Vyacheslav Dubeyko @ 2013-09-20 19:15 UTC
  To: Anatol Pomozov; +Cc: linux-fsdevel


On Sep 20, 2013, at 10:48 PM, Anatol Pomozov wrote:

> Yeah, unionfs looks related. I need something similar but simpler.
>
> I just found wrapfs (http://wrapfs.filesystems.org/), which seems even
> closer to what I am looking for. Is wrapfs the best example of a "simple
> stackable fs"?

Yes, I think so. Simplicity is better. :)

With the best regards,
Vyacheslav Dubeyko.



* Re: How to implement "filesystem operations tracker"?
  2013-09-20 18:00   ` Anatol Pomozov
@ 2013-09-21  0:38     ` Tetsuo Handa
  0 siblings, 0 replies; 7+ messages in thread
From: Tetsuo Handa @ 2013-09-21  0:38 UTC
  To: anatol.pomozov, kirill; +Cc: linux-fsdevel

Anatol Pomozov wrote:
> > What's wrong with strace? In altlinux it was used for ages to find
> > build dependencies.
> 
> As for ptrace, I believe it has high run-time overhead, and it gets
> tricky to use in a multi-threaded environment. Mike Shal from
> https://groups.google.com/forum/#!forum/tup-users is a better source of
> information about it, though.

Then, from the kernel side but outside the filesystems, there are:

  system call auditing
  SystemTap
  LSM modules like TOMOYO / AKARI / CaitSith etc.

> A real-world example of where this is needed is a build system: I run
> "gcc foo.c" and I want to know which files are dependencies of this
> operation. I want to record that information and, if any of the
> dependencies is modified, rerun "gcc".

Are you trying to develop something like a Makefile generator?


end of thread, other threads:[~2013-09-21  0:38 UTC | newest]

Thread overview: 7+ messages
2013-09-20 16:39 How to implement "filesystem operations tracker"? Anatol Pomozov
2013-09-20 16:44 ` Kirill A. Shutemov
2013-09-20 18:00   ` Anatol Pomozov
2013-09-21  0:38     ` Tetsuo Handa
2013-09-20 18:43 ` Vyacheslav Dubeyko
2013-09-20 18:48   ` Anatol Pomozov
2013-09-20 19:15     ` Vyacheslav Dubeyko
