* which dentry a page belongs to
@ 2004-04-23 14:57 Shaya Potter
2004-04-23 15:14 ` Jamie Lokier
0 siblings, 1 reply; 22+ messages in thread
From: Shaya Potter @ 2004-04-23 14:57 UTC (permalink / raw)
To: linux-fsdevel
I'm trying to do something funky with dentries in a filesystem's
writepage() function.
I know it's easy to figure out which inode a page belongs to, since the
page's address_space structure points to the inode:
page->mapping->host
It seems one should also be able to figure out which vm_area_struct
the page belongs to, and from there the correct dentry, but
I'm unsure whether this is easy or should work. Since the
address_space object contains the vm_area_structs of i_mmap and
i_mmap_shared, I should then be able to get the appropriate file and
dentry objects through
page->mapping->i_mmap->vm_file->f_dentry
or
page->mapping->i_mmap_shared->vm_file->f_dentry
1) Is this logic correct? I'm assuming the only thing that matters in
choosing which list to use is whether the page is mapped shared or not.
Is that correct as well?
thanks,
shaya
* Re: which dentry a page belongs to
2004-04-23 14:57 which dentry a page belongs to Shaya Potter
@ 2004-04-23 15:14 ` Jamie Lokier
2004-04-23 15:42 ` Shaya Potter
0 siblings, 1 reply; 22+ messages in thread
From: Jamie Lokier @ 2004-04-23 15:14 UTC (permalink / raw)
To: Shaya Potter; +Cc: linux-fsdevel
Shaya Potter wrote:
> It would seem that since the
> address_space object contains the vm_area_struct's of i_mmap and
> i_mmap_shared I should then be able to get the appropriate file and
> dentry object's through
>
> page->mapping->i_mmap->vm_file->f_dentry
>
> or
>
> page->mapping->i_mmap_shared->vm_file->f_dentry
>
> 1) Is this correct logic? I'm assuming the only things that matters in
> choosing which list is used if the page is map'd shared or not? is that
> correct as well?
No, no and no.
i_mmap and i_mmap_shared are lists. They can both be empty, or both
non-empty. A page can be mapped shared *and* non-shared at the same
time. A page might not be mapped at all.
Also, a page is often mapped in a _subset_ of the mappings which are
found in i_mmap and i_mmap_shared: it depends on its offset, and the
vma offsets, and non-linear mapping offsets.
It is possible to find multiple dentries which are currently being
used to map a page.
It's also possible to find no dentries at all.
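To make the subset point concrete, here is a small userspace C model, not
kernel code. The struct toy_vma_range and both function names are invented
for illustration; the only thing taken from the discussion is that a linear
mapping covers a contiguous range of file pages starting at its offset.
Non-linear mappings are deliberately left out of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a vm_area_struct: which file pages one mapping covers. */
struct toy_vma_range {
    unsigned long vm_pgoff;   /* first file page covered by the mapping */
    unsigned long nr_pages;   /* length of the mapping, in pages */
};

/* Return 1 if this mapping covers file page `pgoff`, else 0. */
int toy_vma_covers(const struct toy_vma_range *vma, unsigned long pgoff)
{
    assert(vma != NULL);
    return pgoff >= vma->vm_pgoff && pgoff - vma->vm_pgoff < vma->nr_pages;
}

/* Count the mappings in `vmas` (n entries) that cover file page `pgoff`. */
size_t toy_count_covering(const struct toy_vma_range *vmas, size_t n,
                          unsigned long pgoff)
{
    size_t i, hits = 0;

    for (i = 0; i < n; i++)
        hits += (size_t)toy_vma_covers(&vmas[i], pgoff);
    return hits;
}
```

With three mappings covering pages 0-3, 2-5 and 10, page 3 is covered by
two of them and page 7 by none, so neither "the" vma nor "the" dentry of a
page is well defined.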
Your question is extremely ill-formed. What do you mean by "the
dentry corresponding to a page"? What do you want the value for?
-- Jamie
* Re: which dentry a page belongs to
2004-04-23 15:14 ` Jamie Lokier
@ 2004-04-23 15:42 ` Shaya Potter
2004-04-23 16:37 ` Christoph Hellwig
0 siblings, 1 reply; 22+ messages in thread
From: Shaya Potter @ 2004-04-23 15:42 UTC (permalink / raw)
To: Jamie Lokier; +Cc: linux-fsdevel
On Fri, 2004-04-23 at 16:14 +0100, Jamie Lokier wrote:
> Shaya Potter wrote:
> > It would seem that since the
> > address_space object contains the vm_area_struct's of i_mmap and
> > i_mmap_shared I should then be able to get the appropriate file and
> > dentry object's through
> >
> > page->mapping->i_mmap->vm_file->f_dentry
> >
> > or
> >
> > page->mapping->i_mmap_shared->vm_file->f_dentry
> >
> > 1) Is this correct logic? I'm assuming the only things that matters in
> > choosing which list is used if the page is map'd shared or not? is that
> > correct as well?
>
> No, no and no.
>
> i_mmap and i_mmap_shared are lists. They can both be empty, or both
> non-empty. A page can be mapped shared *and* non-shared at the same
> time. A page might not be mapped at all.
yes, they can be empty for "generic" pages, but I'm looking at a
specific case of file system pages, so they shouldn't be empty. i.e.
otherwise my fs's writepage() shouldn't be called, I would think.
> Also, a page is often mapped in a _subset_ of the mappings which are
> found in i_mmap and i_mmap_shared: it depends on its offset, and the
> vma offsets, and non-linear mapping offsets.
ok, this I don't understand. need to look into this, so it's not as
simple a dereference as I thought.
> It is possible to find multiple dentries which are currently being
> used to map a page.
a single page can have multiple dentries? but it has only one inode?
(i.e. host) So I can imagine if the single inode is linked in multiple
places (for my purposes I don't care about that directly) but can it
really have multiple inodes?
> It's also possible to find no dentries at all.
even if in my fs's writepage() function?
> Your question is extremely ill-formed. What do you mean by "the
> dentry corresponding to a page"? What do you want the value for?
When my writepage() is called, I want to be able to possibly do dentry
based operations (rename, d_path....) to be told what files are actually
getting written to via writepage() (as opposed to the file system's
write() functionality).
i.e. there are 2 ways for a file to be written to a file system (at
least as far as I understand, could easily be wrong) writepage() and
write().
in write() one knows the dentry (file->f_dentry)
I'm trying to figure out how I can get the same knowledge for
writepage() from the page passed to it.
* Re: which dentry a page belongs to
2004-04-23 15:42 ` Shaya Potter
@ 2004-04-23 16:37 ` Christoph Hellwig
2004-04-23 16:52 ` Shaya Potter
2004-04-24 8:44 ` Jan Hudec
0 siblings, 2 replies; 22+ messages in thread
From: Christoph Hellwig @ 2004-04-23 16:37 UTC (permalink / raw)
To: Shaya Potter; +Cc: Jamie Lokier, linux-fsdevel
On Fri, Apr 23, 2004 at 11:42:19AM -0400, Shaya Potter wrote:
> > i_mmap and i_mmap_shared are lists. They can both be empty, or both
> > non-empty. A page can be mapped shared *and* non-shared at the same
> > time. A page might not be mapped at all.
>
> yes, they can be empty for "generic" pages, but I'm looking at a
> specific case of file system pages, so they shouldn't be empty. i.e.
> otherwise my fs's writepage() shouldn't be called, I would think.
In 2.4 writepage is always the result of data dirtied by mmap. In 2.6 it's
also used for data dirtied by write. Even in 2.4 there's no guarantee that
the mapping that dirtied the page still exists when the page is written out
by the VM.
> > It is possible to find multiple dentries which are currently being
> > used to map a page.
>
> a single page can have multiple dentries? but it has only one inode?
Yes.
> (i.e. host) So I can imagine if the single inode is linked in multiple
> places (for my purposes I don't care about that directly) but can it
> really have multiple inodes?
It can't have multiple inodes.
> > It's also possible to find no dentries at all.
>
> even if in my fs's writepage() function?
Yes.
> > Your question is extremely ill-formed. What do you mean by "the
> > dentry corresponding to a page"? What do you want the value for?
>
> When my writepage() is called, I want to be able to possibly do dentry
> based operations (rename, d_path....) to be told what files are actually
> getting written to via writepage() (as opposed to the file system's
> write() functionality).
You can't do that.
* Re: which dentry a page belongs to
2004-04-23 16:37 ` Christoph Hellwig
@ 2004-04-23 16:52 ` Shaya Potter
2004-04-23 17:01 ` Christoph Hellwig
2004-04-24 8:44 ` Jan Hudec
1 sibling, 1 reply; 22+ messages in thread
From: Shaya Potter @ 2004-04-23 16:52 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jamie Lokier, linux-fsdevel
On Fri, 2004-04-23 at 17:37 +0100, Christoph Hellwig wrote:
> On Fri, Apr 23, 2004 at 11:42:19AM -0400, Shaya Potter wrote:
> > > i_mmap and i_mmap_shared are lists. They can both be empty, or both
> > > non-empty. A page can be mapped shared *and* non-shared at the same
> > > time. A page might not be mapped at all.
> >
> > yes, they can be empty for "generic" pages, but I'm looking at a
> > specific case of file system pages, so they shouldn't be empty. i.e.
> > otherwise my fs's writepage() shouldn't be called, I would think.
>
> in 2.4 writepage is always the result of data dirtied by mmap. In 2.6 it's
> also used for data dirtied by write. Even in 2.4 there's no guarantee
> the mapping that dirtied the page still exists when the page is written out
> by the VM.
so the mapping off the page struct will be null?
> > > It is possible to find multiple dentries which are currently being
> > > used to map a page.
> >
> > a single page can have multiple dentries? but it has only one inode?
>
> Yes.
>
> > (i.e. host) So I can imagine if the single inode is linked in multiple
> > places (for my purposes I don't care about that directly) but can it
> > really have multiple inodes?
>
> It can't have multiple inodes.
so is that a yes to my understanding of "multiple dentries" i.e. a
single inode linked into multiple places in the fs.
>
> > > It's also possible to find no dentries at all.
> >
> > even if in my fs's writepage() function?
>
> Yes.
so the vm_file part of vm_area_struct will be null?
> > > Your question is extremely ill-formed. What do you mean by "the
> > > dentry corresponding to a page"? What do you want the value for?
> >
> > When my writepage() is called, I want to be able to possibly do dentry
> > based operations (rename, d_path....) to be told what files are actually
> > getting written to via writepage() (as opposed to the file system's
> > write() functionality).
>
> You can't do that.
Yes, I know it's evil to do and would never be accepted into the kernel
proper. The question I'm trying to figure out is whether it's possible (with
a stackable fs) to version files on write.
Via the write() interface it's easy: rename the underlying dentry to a new
name, create a new dentry with the old name, and copy the data from the new
name to the old name. With an underlying fs that supports COW semantics this
can be pretty quick, and the stacked fs also manages the upper->lower name
mappings.
The writepage() interface is what I'm trying to solve, without forcing an
open/close (as then I could version on open, very easily), so that one
could essentially ioctl() the fs and have all future initial writes cause a
version. Without a stackable fs, but with a fs that supports versioning,
this is easy, as you just chain everything off the inode. I'm trying to figure
out how to do this with a stackable fs.
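The rename-and-copy scheme for the write() path can be sketched in plain
userspace C. This is only an illustration of the three steps (move the
current file aside, recreate the old name, copy the data back); the function
names toy_copy_file and toy_version_file are made up, and a real stackable
fs would do this on the lower filesystem's dentries rather than on paths.

```c
#include <assert.h>
#include <stdio.h>

/* Copy the contents of `src` into a newly created `dst`. 0 on success. */
int toy_copy_file(const char *src, const char *dst)
{
    char buf[4096];
    size_t n;
    int ok;
    FILE *in = fopen(src, "rb");
    FILE *out = in ? fopen(dst, "wb") : NULL;

    if (out == NULL) {
        if (in)
            fclose(in);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
        if (fwrite(buf, 1, n, out) != n)
            break;
    ok = !ferror(in) && !ferror(out);
    fclose(in);
    return (fclose(out) == 0 && ok) ? 0 : -1;
}

/*
 * Version `path`: move the current file aside to `version_path`, then
 * recreate `path` with the same contents so new writes land in a fresh file.
 */
int toy_version_file(const char *path, const char *version_path)
{
    assert(path != NULL && version_path != NULL);
    if (rename(path, version_path) != 0)
        return -1;
    return toy_copy_file(version_path, path);
}
```

After toy_version_file("file", "file.v1") both names hold the same bytes,
and later writes to "file" no longer touch "file.v1". Note that the function
takes a name, which is exactly the piece of information writepage() lacks.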
* Re: which dentry a page belongs to
2004-04-23 16:52 ` Shaya Potter
@ 2004-04-23 17:01 ` Christoph Hellwig
2004-04-23 17:18 ` Shaya Potter
2004-04-24 8:53 ` Jan Hudec
0 siblings, 2 replies; 22+ messages in thread
From: Christoph Hellwig @ 2004-04-23 17:01 UTC (permalink / raw)
To: Shaya Potter; +Cc: Jamie Lokier, linux-fsdevel
On Fri, Apr 23, 2004 at 12:52:54PM -0400, Shaya Potter wrote:
> > in 2.4 writepage is always the result of data dirtied by mmap. In 2.6 it's
> > also used for data dirtied by write. Even in 2.4 there's no guarantee
> > the mapping that dirtied the page still exists when the page is written out
> > by the VM.
>
> so the mapping off the page struct will be null?
No, but i_mmap and i_mmap_shared might be empty.
> so is that a yes to my understanding of "multiple dentries" i.e. a
> single inode linked into multiple places in the fs.
Yes.
> > > > It's also possible to find no dentries at all.
> > >
> > > even if in my fs's writepage() function?
> >
> > Yes.
>
> so the vm_file part of vm_area_struct will be null?
You won't find a vm_area_struct at all in that case.
> > You can't do that.
>
> Yes, I know it's evil to do and would never be accepted into the kernel
> proper, the question I'm trying to figure out is if it's possible (with
> a stackable fs) to version files on write.
It's not evil, it's fucking impossible, for god's sake.
You can be in writepage with page->mapping->i_mmap{,shared} being empty.
No way in hell you'll ever get to a dentry.
It's not that difficult. And if the English language is so difficult to
understand, ask someone to draw a diagram for you.
* Re: which dentry a page belongs to
2004-04-23 17:01 ` Christoph Hellwig
@ 2004-04-23 17:18 ` Shaya Potter
2004-04-23 17:22 ` Christoph Hellwig
2004-04-23 17:37 ` Jamie Lokier
2004-04-24 8:53 ` Jan Hudec
1 sibling, 2 replies; 22+ messages in thread
From: Shaya Potter @ 2004-04-23 17:18 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jamie Lokier, linux-fsdevel
On Fri, 2004-04-23 at 18:01 +0100, Christoph Hellwig wrote:
> It's not evil, it's fucking impossible, for god's sake.
possibly, just want to be able to prove it to myself, even though your
word is pretty good.
> you can be in writepage with page->mapping->i_mmap{,shared} being empty.
> No way in hell you'll ever get to a dentry.
the question being in what cases will that happen, so I can make a
determination if I care about those cases. (i.e. if the dentry is
deleted, I don't particularly care, as since I am versioning, if it's
already been deleted, don't care)
i.e. how can I determine when i_mmap{,shared} exists and when it doesn't?
(in file system page context).
thanks,
shaya
* Re: which dentry a page belongs to
2004-04-23 17:18 ` Shaya Potter
@ 2004-04-23 17:22 ` Christoph Hellwig
2004-04-23 17:32 ` Shaya Potter
2004-04-23 17:37 ` Jamie Lokier
1 sibling, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2004-04-23 17:22 UTC (permalink / raw)
To: Shaya Potter; +Cc: Jamie Lokier, linux-fsdevel
On Fri, Apr 23, 2004 at 01:18:27PM -0400, Shaya Potter wrote:
> > you can be in writepage with page->mapping->i_mmap{,shared} being empty.
> > No way in hell you'll ever get to a dentry.
>
> the question being in what cases will that happen, so I can make a
> determination if I care about those cases. (i.e. if the dentry is
> deleted, I don't particularly care, as since I am versioning, if it's
> already been deleted, don't care)
>
> i.e. how can I determine when i_mmap{,shared} exists and when it doesn't?
> (in file system page context).
if there's no mapping anymore at the point of the writeback. e.g. when
an munmap happened before the writeback is scheduled. And you can't
determine it except by checking whether it's empty.
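As a rough picture of "checking whether it's empty", the following userspace
C model mimics the kernel's circular list_head convention, where a list is
empty when the head points at itself. All toy_* names are invented for this
sketch, and struct toy_mapping only stands in for the two address_space
lists named in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct toy_list { struct toy_list *next, *prev; };

void toy_list_init(struct toy_list *h) { h->next = h; h->prev = h; }
int toy_list_empty(const struct toy_list *h) { return h->next == h; }

/* Link `entry` in right after `head`. */
void toy_list_add(struct toy_list *entry, struct toy_list *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink `entry` and leave it as an empty list of its own. */
void toy_list_del(struct toy_list *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    toy_list_init(entry);
}

/* Toy address_space with the two mapping lists discussed in the thread. */
struct toy_mapping { struct toy_list i_mmap, i_mmap_shared; };

void toy_mapping_init(struct toy_mapping *m)
{
    assert(m != NULL);
    toy_list_init(&m->i_mmap);
    toy_list_init(&m->i_mmap_shared);
}

/* 1 if any vma is still linked, 0 if writeback would find none. */
int toy_mapping_has_vma(const struct toy_mapping *m)
{
    assert(m != NULL);
    return !toy_list_empty(&m->i_mmap) || !toy_list_empty(&m->i_mmap_shared);
}
```

In this model an munmap corresponds to toy_list_del() on the vma's link.
Once the last one is gone, toy_mapping_has_vma() returns 0 even though the
page may still be dirty and waiting for writeback.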
p.s. wondering what strange homework assignments they give these days..
* Re: which dentry a page belongs to
2004-04-23 17:22 ` Christoph Hellwig
@ 2004-04-23 17:32 ` Shaya Potter
0 siblings, 0 replies; 22+ messages in thread
From: Shaya Potter @ 2004-04-23 17:32 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jamie Lokier, linux-fsdevel
On Fri, 2004-04-23 at 18:22 +0100, Christoph Hellwig wrote:
> On Fri, Apr 23, 2004 at 01:18:27PM -0400, Shaya Potter wrote:
> > > you can be in writepage with page->mapping->i_mmap{,shared} being empty.
> > > No way in hell you'll ever get to a dentry.
> >
> > the question being in what cases will that happen, so I can make a
> > determination if I care about those cases. (i.e. if the dentry is
> > deleted, I don't particularly care, as since I am versioning, if it's
> > already been deleted, don't care)
> >
> > > i.e. how can I determine when i_mmap{,shared} exists and when it doesn't?
> > (in file system page context).
>
> if there's no mapping anymore at the point of the writeback. e.g. when
> an munmap happened before the writeback is scheduled. And you can't
> determine it except by checking whether it's empty.
so that could be very repeatable if a process does a write to memory and
then quits. hmm.
> p.s. wondering what strange homework assignments they give these days..
not homework, research project. and yes, it is strange. :) I'm
finished w/ my course work.
* Re: which dentry a page belongs to
2004-04-23 17:18 ` Shaya Potter
2004-04-23 17:22 ` Christoph Hellwig
@ 2004-04-23 17:37 ` Jamie Lokier
2004-04-23 17:59 ` Shaya Potter
2004-04-23 18:05 ` Shaya Potter
1 sibling, 2 replies; 22+ messages in thread
From: Jamie Lokier @ 2004-04-23 17:37 UTC (permalink / raw)
To: Shaya Potter; +Cc: Christoph Hellwig, linux-fsdevel
Shaya Potter wrote:
> > you can be in writepage with page->mapping->i_mmap{,shared} being empty.
> > No way in hell you'll ever get to a dentry.
>
> the question being in what cases will that happen, so I can make a
> determination if I care about those cases. (i.e. if the dentry is
> deleted, I don't particularly care, as since I am versioning, if it's
> already been deleted, don't care)
>
> i.e. how can I determine when i_mmap{,shared} exists and when it doesn't?
> (in file system page context).
If you do mmap, then modify the pages, then munmap or exit, your
->writepage function is sometimes called _after_ that.
That means you can get no vmas in i_mmap{,shared} when doing
perfectly normal writable shared mappings.
Even if you do find vmas, they can easily correspond to the wrong
dentries, so you'll operate on the wrong files in your stackable fs.
Your problem is that you are trying to use ->writepage for something
it doesn't do.
You should be using the ->mmap operations of file_operations instead,
and do your versioning operation at the time a writable shared mapping
is created. (The kernel does not provide a way to track when pages
are actually modified through a specific mapping). You can look in
generic_file_mmap() to see the condition which tests for a writable
shared mapping. Test that, do the versioning operation, and then call
generic_file_mmap from your own mmap function to finish the job.
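The shape of such an ->mmap hook can be modelled in userspace C as below.
This is a sketch under assumptions, not kernel code: the TOY_VM_* constants
are local stand-ins (their values are believed to match VM_SHARED and
VM_MAYWRITE in linux/mm.h), toy_generic_file_mmap() is a placeholder for the
real generic_file_mmap(), and the versioning step is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's vm_flags bits; assumed to mirror linux/mm.h. */
#define TOY_VM_SHARED   0x00000008UL
#define TOY_VM_MAYWRITE 0x00000020UL

struct toy_vma_flags { unsigned long vm_flags; };

/* Counts how often the hypothetical versioning step ran. */
int toy_versions_taken;

/* 1 if the mapping is shared and may be written through, else 0. */
int toy_is_writable_shared(const struct toy_vma_flags *vma)
{
    assert(vma != NULL);
    return (vma->vm_flags & TOY_VM_SHARED) &&
           (vma->vm_flags & TOY_VM_MAYWRITE);
}

/* Placeholder for generic_file_mmap(); always succeeds in this model. */
int toy_generic_file_mmap(struct toy_vma_flags *vma)
{
    (void)vma;
    return 0;
}

/* Hypothetical ->mmap: version first, then hand off to the generic code. */
int toy_fs_mmap(struct toy_vma_flags *vma)
{
    if (toy_is_writable_shared(vma))
        toy_versions_taken++;   /* the real versioning work would go here */
    return toy_generic_file_mmap(vma);
}
```

Private mappings and read-only shared mappings pass straight through; only
a mapping that is both shared and possibly writable triggers a version, and
it does so once, when the mapping is created.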
-- Jamie
* Re: which dentry a page belongs to
2004-04-23 17:37 ` Jamie Lokier
@ 2004-04-23 17:59 ` Shaya Potter
2004-04-23 22:13 ` Jamie Lokier
2004-04-23 18:05 ` Shaya Potter
1 sibling, 1 reply; 22+ messages in thread
From: Shaya Potter @ 2004-04-23 17:59 UTC (permalink / raw)
To: Jamie Lokier; +Cc: Christoph Hellwig, linux-fsdevel
On Fri, 2004-04-23 at 18:37 +0100, Jamie Lokier wrote:
> Shaya Potter wrote:
> > > you can be in writepage with page->mapping->i_mmap{,shared} being empty.
> > > No way in hell you'll ever get to a dentry.
> >
> > the question being in what cases will that happen, so I can make a
> > determination if I care about those cases. (i.e. if the dentry is
> > deleted, I don't particularly care, as since I am versioning, if it's
> > already been deleted, don't care)
> >
> > > i.e. how can I determine when i_mmap{,shared} exists and when it doesn't?
> > (in file system page context).
>
> If you do mmap, then modify the pages, then munmap or exit, your
> ->writepage function is sometimes called _after_ that.
>
> That means you can get no vmas in i_mmap{,shared} when doing
> perfectly normal writable shared mappings.
just realized an insanely ugly solution would be for all mapped files to
map inode -> dentry (with code that makes sure they are still valid, i.e
if dentry gets deleted or renamed). Since the page always has its
inode host, could figure it out via that. Since relatively few mapped
files (compared to amount of files on fs) shouldn't be a huge memory
overhead.
don't think that's going to be my solution though :)
* Re: which dentry a page belongs to
2004-04-23 17:37 ` Jamie Lokier
2004-04-23 17:59 ` Shaya Potter
@ 2004-04-23 18:05 ` Shaya Potter
2004-04-23 21:37 ` Jamie Lokier
1 sibling, 1 reply; 22+ messages in thread
From: Shaya Potter @ 2004-04-23 18:05 UTC (permalink / raw)
To: Jamie Lokier; +Cc: Christoph Hellwig, linux-fsdevel
On Fri, 2004-04-23 at 18:37 +0100, Jamie Lokier wrote:
> You should be using the ->mmap operations of file_operations instead,
> and do your versioning operation at the time a writable shared mapping
> is created. (The kernel does not provide a way to track when pages
> are actually modified through a specific mapping). You can look in
> generic_file_mmap() to see the condition which tests for a writable
> shared mapping. Test that, do the versioning operation, and then call
> generic_file_mmap from your own mmap function to finish the job.
right, I know i can do that (or on lookup(), or on open() ) but that
basically limits me to an open/close transaction for versioning. Can't
ioctl the fs, and cause all future writes to be in a new version.
shaya
* Re: which dentry a page belongs to
2004-04-23 18:05 ` Shaya Potter
@ 2004-04-23 21:37 ` Jamie Lokier
2004-04-23 22:26 ` Shaya Potter
0 siblings, 1 reply; 22+ messages in thread
From: Jamie Lokier @ 2004-04-23 21:37 UTC (permalink / raw)
To: Shaya Potter; +Cc: Christoph Hellwig, linux-fsdevel
Shaya Potter wrote:
> right, I know i can do that (or on lookup(), or on open() ) but that
> basically limits me to a open/close transaction for versioning. Can't
> ioctl the fs, and cause all future writes to be in a new version.
"all future writes" is not very well defined with shared writable mappings.
Imagine this:
1. program writes to a page, dirty bit is set in pte
[6 weeks later...]
2. you do the ioctl
3. your writepage gets called
Now you will store an update for a change which happened 6 weeks
before you called the ioctl(). Is that what you really wanted?
Perhaps it is.
-- Jamie
* Re: which dentry a page belongs to
2004-04-23 17:59 ` Shaya Potter
@ 2004-04-23 22:13 ` Jamie Lokier
0 siblings, 0 replies; 22+ messages in thread
From: Jamie Lokier @ 2004-04-23 22:13 UTC (permalink / raw)
To: Shaya Potter; +Cc: Christoph Hellwig, linux-fsdevel
Shaya Potter wrote:
> > That means you can get no vmas in i_mmap{,shared} when doing
> > perfectly normal writable shared mappings.
>
> just realized an insanely ugly solution would be for all mapped files to
> map inode -> dentry (with code that makes sure they are still valid, i.e
> if dentry gets deleted or renamed). Since the page always has its
> inode host, could figure it out via that. Since relatively few mapped
> files (compared to amount of files on fs) shouldn't be a huge memory
> overhead.
It's not ugly, it's wrong. You'd end up version-controlling the wrong files.
It's obvious that you want:
1. Program A opens ("/a/path1", O_RDWR);
2. Program B opens ("/a/path2", O_RDWR);
3. Program A calls mmap.
4. Program B calls mmap.
5. You call your fs version control ioctl.
6. Program A updates data.
7. Program A unmaps or exits, which transfers the dirty bits.
8. Later, ->writepage is called.
9. You store a versioned snapshot of "/a/path1".
Your "insanely ugly solution" could easily snapshot "/a/path2" instead.
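The ambiguity in this scenario presumably comes from "/a/path1" and
"/a/path2" being hard links to one inode. The userspace C sketch below shows
that case with ordinary POSIX calls: two names, one inode, and nothing in
the inode that says which name a writer used. The function name and the
scratch directory template are made up for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a scratch file, hard-link it under a second name, and report
 * whether both names resolve to the same inode. Returns 1 if they do,
 * 0 if they do not, -1 on any error.
 */
int toy_two_names_one_inode(void)
{
    char dir[] = "/tmp/dentry-demo-XXXXXX";
    char path1[64], path2[64];
    struct stat st1, st2;
    FILE *f;
    int result = -1;

    assert(sizeof(dir) + sizeof("/path1") <= sizeof(path1));
    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path1, sizeof(path1), "%s/path1", dir);
    snprintf(path2, sizeof(path2), "%s/path2", dir);

    f = fopen(path1, "w");
    if (f != NULL) {
        fclose(f);
        if (link(path1, path2) == 0 &&
            stat(path1, &st1) == 0 && stat(path2, &st2) == 0)
            result = (st1.st_dev == st2.st_dev &&
                      st1.st_ino == st2.st_ino &&
                      st1.st_nlink == 2);
        unlink(path2);          /* harmless if the link was never created */
    }
    unlink(path1);
    rmdir(dir);
    return result;
}
```

Since page->mapping->host only identifies the inode, an inode -> dentry map
has to pick one of the two names, and it has no way to know which one
program A opened.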
-- Jamie
* Re: which dentry a page belongs to
2004-04-23 21:37 ` Jamie Lokier
@ 2004-04-23 22:26 ` Shaya Potter
2004-04-23 22:49 ` Jamie Lokier
0 siblings, 1 reply; 22+ messages in thread
From: Shaya Potter @ 2004-04-23 22:26 UTC (permalink / raw)
To: Jamie Lokier; +Cc: Christoph Hellwig, linux-fsdevel
On Fri, 2004-04-23 at 22:37 +0100, Jamie Lokier wrote:
> Shaya Potter wrote:
> > right, I know i can do that (or on lookup(), or on open() ) but that
> > basically limits me to a open/close transaction for versioning. Can't
> > ioctl the fs, and cause all future writes to be in a new version.
>
> "all future writes" is not very well defined with shared writable mappings.
>
> Imagine this:
>
> 1. program writes to a page, dirty bit is set in pte
>
> [6 weeks later...]
>
> 2. you do the ioctl
> 3. your writepage gets called
>
> Now you will store an update for a change which happened 6 weeks
> before you called the ioctl(). Is that what you really wanted?
> Perhaps it is.
OK, great example, now I see the problem which can't be easily solved.
update-data
ioctl
update-data
writepage()
writepage can't differentiate between data from before and after the ioctl,
so the only way to do it would be to force a sync before the ioctl is called,
which I would think (perhaps wrongly) should write out all the data to disk.
But I'm unsure whether I can programmatically test if it worked or not.
shaya
* Re: which dentry a page belongs to
2004-04-23 22:26 ` Shaya Potter
@ 2004-04-23 22:49 ` Jamie Lokier
2004-04-25 5:23 ` Shaya Potter
0 siblings, 1 reply; 22+ messages in thread
From: Jamie Lokier @ 2004-04-23 22:49 UTC (permalink / raw)
To: Shaya Potter; +Cc: Christoph Hellwig, linux-fsdevel
Shaya Potter wrote:
> OK, great example, now I see the problem which can't be easily solved.
>
> update-data
> ioctl
> update-data
> writepage()
>
> writepage can't differentiate b/w data b4 and data after, so only way to
> do it would be to force a sync b4 ioctl is called, which I would think
> (perhaps wrong) should write out all the data to disk. But unsure if
> programmatically able to really test to see if it worked or not.
"sync" doesn't do anything to help.
Read up on msync() until you understand what is intended here.
In general, the first "update-data" in your example doesn't count as a
"write" and it wouldn't be logical for a program to expect it to be
snapshotted. It counts as the program queuing up data which it does
not expect to be visible in the file until the next msync, munmap or
exit (although it is visible to other concurrent mmaps, and it _might_
be visible in the file, or parts of it might).
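For reference, this is roughly what the msync() pattern looks like from the
application side. It is a minimal userspace sketch with a made-up function
name and scratch path: the store through the mapping is a plain memcpy, with
no write() call, and msync(MS_SYNC) is the point at which the program asks
for the data to be in the file.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Store `msg` into a scratch file through a MAP_SHARED mapping, msync()
 * it, then read it back with pread(). Returns 1 if the bytes match,
 * 0 if they differ, -1 on any error.
 */
int toy_mmap_store_and_sync(const char *msg)
{
    char path[] = "/tmp/msync-demo-XXXXXX";
    char back[64] = { 0 };
    size_t len;
    char *map;
    int fd, result = -1;

    assert(msg != NULL);
    len = strlen(msg);
    if (len == 0 || len >= sizeof(back))
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* the open fd keeps the file alive */

    if (ftruncate(fd, (off_t)len) == 0) {
        map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            memcpy(map, msg, len);      /* dirties the page, no write() call */
            if (msync(map, len, MS_SYNC) == 0 &&
                pread(fd, back, len, 0) == (ssize_t)len)
                result = (memcmp(back, msg, len) == 0);
            munmap(map, len);
        }
    }
    close(fd);
    return result;
}
```

Nothing in this sequence tells the filesystem when the memcpy happened
relative to an ioctl(); the only ordering points the program itself commits
to are the msync, the munmap and its exit.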
-- Jamie
* Re: which dentry a page belongs to
2004-04-23 16:37 ` Christoph Hellwig
2004-04-23 16:52 ` Shaya Potter
@ 2004-04-24 8:44 ` Jan Hudec
2004-04-24 9:20 ` Christoph Hellwig
1 sibling, 1 reply; 22+ messages in thread
From: Jan Hudec @ 2004-04-24 8:44 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Shaya Potter, Jamie Lokier, linux-fsdevel
On Fri, Apr 23, 2004 at 17:37:38 +0100, Christoph Hellwig wrote:
> in 2.4 writepage is always the result of data dirtied by mmap. In 2.6 it's
> also used for data dirtied by write. Even in 2.4 there's no guarantee
> the mapping that dirtied the page still exists when the page is written out
> by the VM.
No, It's the same in 2.4 and 2.6 -- both may use writepage for write
(depends on how you implement your commit_write).
-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <bulb@ucw.cz>
* Re: which dentry a page belongs to
2004-04-23 17:01 ` Christoph Hellwig
2004-04-23 17:18 ` Shaya Potter
@ 2004-04-24 8:53 ` Jan Hudec
1 sibling, 0 replies; 22+ messages in thread
From: Jan Hudec @ 2004-04-24 8:53 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Shaya Potter, Jamie Lokier, linux-fsdevel
On Fri, Apr 23, 2004 at 18:01:30 +0100, Christoph Hellwig wrote:
> On Fri, Apr 23, 2004 at 12:52:54PM -0400, Shaya Potter wrote:
> > > in 2.4 writepage is always the result of data dirtied by mmap. In 2.6 it's
> > > also used for data dirtied by write. Even in 2.4 there's no guarantee
> > > the mapping that dirtied the page still exists when the page is written out
> > > by the VM.
> >
> > so the mapping off the page struct will be null?
>
> No. but i_mmap and i_mmap_shared might be empty.
>
> > so is that a yes to my understanding of "multiple dentries" i.e. a
> > single inode linked into multiple places in the fs.
>
> Yes.
>
> > > > > It's also possible to find no dentries at all.
> > > >
> > > > even if in my fs's writepage() function?
> > >
> > > Yes.
> >
> > so the vm_file part of vm_area_struct will be null?
>
> You won't find a vm_area_struct at all in that case.
>
> > > You can't do that.
> >
> > Yes, I know it's evil to do and would never be accepted into the kernel
> > proper, the question I'm trying to figure out is if it's possible (with
> > a stackable fs) to version files on write.
>
> It's not evil, it's fucking impossible, for god's sake.
>
> you can be in writepage with page->mapping->i_mmap{,shared} being empty.
> No way in hell you'll ever get to a dentry.
How about page->mapping->host->i_dentries and taking the first item there?
Of course, an inode can survive its last dentry. But you can make sure it
won't survive dirty!
-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <bulb@ucw.cz>
* Re: which dentry a page belongs to
2004-04-24 8:44 ` Jan Hudec
@ 2004-04-24 9:20 ` Christoph Hellwig
2004-04-24 9:32 ` Jan Hudec
0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2004-04-24 9:20 UTC (permalink / raw)
To: Jan Hudec; +Cc: Shaya Potter, Jamie Lokier, linux-fsdevel
On Sat, Apr 24, 2004 at 10:44:46AM +0200, Jan Hudec wrote:
> On Fri, Apr 23, 2004 at 17:37:38 +0100, Christoph Hellwig wrote:
> > in 2.4 writepage is always the result of data dirtied by mmap. In 2.6 it's
> > also used for data dirtied by write. Even in 2.4 there's no guarantee
> > the mapping that dirtied the page still exists when the page is written out
> > by the VM.
>
> No, It's the same in 2.4 and 2.6 -- both may use writepage for write
> (depends on how you implement your commit_write).
Well, okay. At least xfs uses writepage and some network filesystems do
as well. The filesystems using the generic fs/buffer.c routines don't use
writepage, at least.
* Re: which dentry a page belongs to
2004-04-24 9:20 ` Christoph Hellwig
@ 2004-04-24 9:32 ` Jan Hudec
0 siblings, 0 replies; 22+ messages in thread
From: Jan Hudec @ 2004-04-24 9:32 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Shaya Potter, Jamie Lokier, linux-fsdevel
On Sat, Apr 24, 2004 at 10:20:56 +0100, Christoph Hellwig wrote:
> On Sat, Apr 24, 2004 at 10:44:46AM +0200, Jan Hudec wrote:
> > On Fri, Apr 23, 2004 at 17:37:38 +0100, Christoph Hellwig wrote:
> > > in 2.4 writepage is always the result of data dirtied by mmap. In 2.6 it's
> > > also used for data dirtied by write. Even in 2.4 there's no guarantee
> > > the mapping that dirtied the page still exists when the page is written out
> > > by the VM.
> >
> > No, It's the same in 2.4 and 2.6 -- both may use writepage for write
> > (depends on how you implement your commit_write).
>
> Well, okay. At least xfs uses writepage and some network filesystems do
> as well. The filesystems using generic fs/buffer.c routines don't use
> writepage at least..
Looked at ext2 code... OK, in 2.4 the generic_commit_write goes right to
buffer cache. So filesystems using it do not use writepage. But
filesystems may set_page_dirty in commit_write and that would mean
writepage.
-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <bulb@ucw.cz>
* Re: which dentry a page belongs to
2004-04-23 22:49 ` Jamie Lokier
@ 2004-04-25 5:23 ` Shaya Potter
2004-04-25 23:22 ` Erez Zadok
0 siblings, 1 reply; 22+ messages in thread
From: Shaya Potter @ 2004-04-25 5:23 UTC (permalink / raw)
To: Jamie Lokier; +Cc: Christoph Hellwig, linux-fsdevel
On Fri, 2004-04-23 at 23:49 +0100, Jamie Lokier wrote:
> Shaya Potter wrote:
> > OK, great example, now I see the problem which can't be easily solved.
> >
> > update-data
> > ioctl
> > update-data
> > writepage()
> >
> > writepage can't differentiate b/w data b4 and data after, so only way to
> > do it would be to force a sync b4 ioctl is called, which I would think
> > (perhaps wrong) should write out all the data to disk. But unsure if
> > programmatically able to really test to see if it worked or not.
>
> "sync" doesn't do anything to help.
right.
> In general, first "update-data" in your example doesn't count as a
> "write" and it wouldn't be logical for a program to expect it to be
> snapshotted. It counts as the program queuing up data which it does
> not expect to be visible in the file until the next msync, munmap or
> exit (although it is visible to other concurrent mmaps, and it _might_
> be visible in the file, or parts of it might).
right.
But as I just realized, for my purposes it doesn't matter much. The
file system I'm working on is meant to go together with ZAP, which is our
kernel-based process checkpoint/restart/migration infrastructure. We
wanted a versioning fs to go with it so we aren't stuck just doing
checkpoint ; restart ; checkpoint ; restart, but could checkpoint many
times in a row and restart many times from a single arbitrary checkpoint
(not just the last one).
One thing we have to do, even right now, is checkpoint all dirty pages
associated with a process. Hence the right thing to do after a
snapshot/version would be to version immediately in writepage() as the
process checkpoint code will have taken care of saving the dirty pages
(and the restart code will take care of restoring them and marking them
dirty).
Though I'm guessing it would be a slightly different story for write(), if
that uses the writepage() interface, as the pages will be "anonymous" (i.e.
not mapped inside a process).
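To make the quoted point about msync concrete, here is a minimal
userspace sketch. It assumes a POSIX system; mmap_store_and_sync() is a
hypothetical helper name, and the 256-byte limit is arbitrary. The store
into the shared mapping plays the role of "update-data", and the program
only relies on the file contents after msync() returns.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Store msg through a shared mapping of path, msync() it, then read it
 * back with pread(). Returns 0 if the file contents match, -1 otherwise.
 * Hypothetical helper, for illustration only.
 */
int mmap_store_and_sync(const char *path, const char *msg)
{
    char buf[256];
    char *map;
    size_t len;
    int fd;
    int ret = -1;

    assert(path != NULL && msg != NULL);
    len = strlen(msg);
    if (len == 0 || len > sizeof(buf))
        return -1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0)
        goto out_close;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    /* The "update-data" step: a plain store, no system call involved. */
    memcpy(map, msg, len);

    /* After a successful msync() the program may rely on the file. */
    if (msync(map, len, MS_SYNC) != 0)
        goto out_unmap;

    if (pread(fd, buf, len, 0) == (ssize_t)len && memcmp(buf, msg, len) == 0)
        ret = 0;

out_unmap:
    munmap(map, len);
out_close:
    close(fd);
    return ret;
}
```

Nothing in this sketch tells the filesystem which store dirtied the page,
which is why writepage() cannot separate data written before the ioctl
from data written after it.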
* Re: which dentry a page belongs to
2004-04-25 5:23 ` Shaya Potter
@ 2004-04-25 23:22 ` Erez Zadok
0 siblings, 0 replies; 22+ messages in thread
From: Erez Zadok @ 2004-04-25 23:22 UTC (permalink / raw)
To: Shaya Potter; +Cc: linux-fsdevel
Shaya,
Let me describe an ugly solution we had to do for the SCA stacked file
systems, esp. gzipfs (Usenix 2001 paper, see www.filesystems.org). In
gzipfs, for every file F we keep a file F.idx that includes an index mapping
of compressed to uncompressed pages. Alas, ->writepage only gives us a
page, from which we can get the inode, and not the name.
What people have told you on this list is correct: you cannot assume you'll
have a dentry to get back from the inode, or you might have two or more,
etc.
The way we got a dentry from an inode, in gzipfs, is that on file ->open,
once we get a fully-instantiated dentry+inode, we store a dentry inside the
*private* field of the inode in question, w/ the dentry's refcnt increased.
Then in ->writepage we can get to the dentry. We have to carefully ensure
that we discard that dentry and inode properly elsewhere. Of course, this
is ugly, but it worked for a restricted set of circumstances in which the
file is likely to be opened first, giving us a name to store in the inode.
It does NOT work for hard-linked files, and we have special code that alerts
us to such conditions so we don't try to stuff a new dentry in our
inode's private data. I'm not pleased w/ this solution, but sans major
VFS surgery, we couldn't see an easier solution. Fundamentally, the problem
that you and I and others have come across time and time again has to do
with (unfortunate) Unix semantics that say that given an inode number, you
cannot efficiently get to one or all of that inode's names. (We have an
efficient and portable in-kernel solution for this problem -- details
*after* 5/26 :-)
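
As a rough illustration of the stash-on-open idea, here is a toy userspace
model. It is not the gzipfs code and not the kernel's structures: the
toy_* types and functions are hypothetical, and the rule for skipping
hard-linked files is only one reading of the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace stand-ins; these are NOT the kernel's structures. */
struct toy_dentry {
    const char *name;
    int refcnt;
};

struct toy_inode {
    int nlink;                      /* hard-link count */
    struct toy_dentry *fs_private;  /* stashed dentry, or NULL */
};

/*
 * On open: stash the dentry and take a reference, unless the file is
 * hard-linked or a dentry is already stashed. Returns 1 if stashed.
 */
int toy_open(struct toy_inode *ino, struct toy_dentry *d)
{
    assert(ino != NULL && d != NULL);
    if (ino->nlink > 1 || ino->fs_private != NULL)
        return 0;
    d->refcnt++;
    ino->fs_private = d;
    return 1;
}

/* In writepage: a name is available only if an earlier open stashed it. */
const char *toy_writepage_name(const struct toy_inode *ino)
{
    assert(ino != NULL);
    return ino->fs_private ? ino->fs_private->name : NULL;
}

/* On teardown: drop the reference taken in toy_open(). */
void toy_release(struct toy_inode *ino)
{
    assert(ino != NULL);
    if (ino->fs_private != NULL) {
        ino->fs_private->refcnt--;
        ino->fs_private = NULL;
    }
}
```

The model shows both limits mentioned above: toy_writepage_name() returns
NULL when no open came first, and toy_open() refuses to stash anything for
an inode with more than one link.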
Cheers,
Erez.
Thread overview: 22+ messages
2004-04-23 14:57 which dentry a page belongs to Shaya Potter
2004-04-23 15:14 ` Jamie Lokier
2004-04-23 15:42 ` Shaya Potter
2004-04-23 16:37 ` Christoph Hellwig
2004-04-23 16:52 ` Shaya Potter
2004-04-23 17:01 ` Christoph Hellwig
2004-04-23 17:18 ` Shaya Potter
2004-04-23 17:22 ` Christoph Hellwig
2004-04-23 17:32 ` Shaya Potter
2004-04-23 17:37 ` Jamie Lokier
2004-04-23 17:59 ` Shaya Potter
2004-04-23 22:13 ` Jamie Lokier
2004-04-23 18:05 ` Shaya Potter
2004-04-23 21:37 ` Jamie Lokier
2004-04-23 22:26 ` Shaya Potter
2004-04-23 22:49 ` Jamie Lokier
2004-04-25 5:23 ` Shaya Potter
2004-04-25 23:22 ` Erez Zadok
2004-04-24 8:53 ` Jan Hudec
2004-04-24 8:44 ` Jan Hudec
2004-04-24 9:20 ` Christoph Hellwig
2004-04-24 9:32 ` Jan Hudec