linux-fsdevel.vger.kernel.org archive mirror
* Writing out a (file) mmapped page
From: Martin Jambor @ 2005-06-12 16:49 UTC
  To: linux-fsdevel

Hi,

I have spent a few hours trying to find out how dirty mmapped pages
are written out in filesystems using the "generic" functions but so
far I have not been successful. The main thing that escapes me is the
following:

block_write_full_page() writes out only buffers marked dirty, or the
whole page when there are no buffers associated with it. Where in the
kernel are buffers either marked dirty or stripped off an mmapped page
when the page itself becomes dirty?  I would be very grateful for a
pointer to the source, possibly accompanied by a brief explanation of
how it gets called.

One comment in buffer.c suggests aops->prepare_write is called by a
page fault handler for mmapped pages but I found no such call (using
cscope).

Thank you very much for any comment on this,

Martin Jambor


* Re: Writing out a (file) mmapped page
From: Coywolf Qi Hunt @ 2005-06-13  2:32 UTC
  To: Martin Jambor; +Cc: linux-fsdevel

On 6/13/05, Martin Jambor <jamborm@gmail.com> wrote:
> Hi,
> 
> I have spent a few hours trying to find out how dirty mmapped pages
> are written out in filesystems using the "generic" functions but so
> far I have not been successful. The main thing that escapes me is the
> following:
> 
> block_write_full_page() writes out only buffers marked dirty, or the
> whole page when there are no buffers associated with it. Where in the
> kernel are buffers either marked dirty or stripped off an mmapped page
> when the page itself becomes dirty?  I would be very grateful for a
> pointer to the source, possibly accompanied by a brief explanation of
> how it gets called.
> 
> One comment in buffer.c suggests aops->prepare_write is called by a
> page fault handler for mmapped pages but I found no such call (using
> cscope).

generic_file_buffered_write() calls a_ops->prepare_write().

-- 
Coywolf Qi Hunt
http://ahbl.org/~coywolf/


* Re: Writing out a (file) mmapped page
From: Jörn Engel @ 2005-06-13  6:57 UTC
  To: Coywolf Qi Hunt; +Cc: Martin Jambor, linux-fsdevel

On Mon, 13 June 2005 10:32:50 +0800, Coywolf Qi Hunt wrote:
> On 6/13/05, Martin Jambor <jamborm@gmail.com> wrote:
> > 
> > I have spent a few hours trying to find out how dirty mmapped pages
> > are written out in filesystems using the "generic" functions but so
> > far I have not been successful. The main thing that escapes me is the
> > following:
> > 
> > block_write_full_page() writes out only buffers marked dirty, or the
> > whole page when there are no buffers associated with it. Where in the
> > kernel are buffers either marked dirty or stripped off an mmapped page
> > when the page itself becomes dirty?  I would be very grateful for a
> > pointer to the source, possibly accompanied by a brief explanation of
> > how it gets called.
> > 
> > One comment in buffer.c suggests aops->prepare_write is called by a
> > page fault handler for mmapped pages but I found no such call (using
> > cscope).
> 
> generic_file_buffered_write() calls a_ops->prepare_write().

That's the write path, not mmap.  writepage() is your friend.

Jörn

-- 
Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it.
-- Brian W. Kernighan


* Re: Writing out a (file) mmapped page
From: Anton Altaparmakov @ 2005-06-14 15:36 UTC
  To: Martin Jambor; +Cc: linux-fsdevel

Hi,

DISCLAIMER:  I am no vm expert and the below is just my take on things.
And note it is i386 specific, I know next to nothing about the insides
of other architectures...  I am sure that if I have described anything
wrong someone who knows better will correct me.  (-:

On Sun, 2005-06-12 at 18:49 +0200, Martin Jambor wrote:
> I have spent a few hours trying to find out how dirty mmapped pages
> are written out in filesystems using the "generic" functions but so
> far I have not been successful. The main thing that escapes me is the
> following:
> 
> block_write_full_page() writes out only buffers marked dirty, or the
> whole page when there are no buffers associated with it. Where in the
> kernel are buffers either marked dirty or stripped off an mmapped page
> when the page itself becomes dirty?  I would be very grateful for a
> pointer to the source, possibly accompanied by a brief explanation of
> how it gets called.

That is a little trickier than one might expect because it is partially
done in hardware (heavily arch dependent though).

When a write to a writable mmapped page happens the CPU sets the page
dirty flag in hardware.  So there is no code where you can see this
happen.  Later on, at msync() time or munmap() time, this hardware dirty
bit results in a set_page_dirty() being called which for buffer based
filesystems will be a __set_page_dirty_buffers() which will dirty both
the buffers and the page.  For example take the msync case:

mm/msync.c::sys_msync ->
	msync_interval ->
		filemap_sync ->
			sync_page_range ->
				sync_pud_range ->
					sync_pmd_range ->
						sync_pte_range ->
							set_page_dirty

The beauty of short inline functions.  (-:
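
Seen from user space, the msync path above can be exercised with a few
lines of C.  What follows is only a minimal sketch and not code from the
kernel tree: the helper name mmap_write_and_sync() and the temporary
file name are made up for illustration.  It stores into a MAP_SHARED
mapping (the store itself runs no kernel code once the pte is writable),
calls msync(MS_SYNC), and reads the bytes back with pread().  Note that
pread() is served from the same page cache, so this shows that the store
is visible through the file, not that it has reached the disk.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write msg into a fresh temporary file through a
 * shared mapping, msync() it, then read it back into out with pread().
 * Returns 0 on success and -1 on any error.
 */
int mmap_write_and_sync(const char *msg, char *out, size_t outlen)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";  /* illustrative name only */
    size_t len = strlen(msg);
    ssize_t n = -1;
    char *map;
    int fd;

    assert(out != NULL);
    if (len == 0 || len >= outlen)
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* file lives until fd is closed */

    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Plain stores: the first one faults, later ones only set pte bits. */
    memcpy(map, msg, len);

    /* Ask the kernel to find the dirty ptes and write the page out. */
    if (msync(map, len, MS_SYNC) == 0)
        n = pread(fd, out, len, 0);

    munmap(map, len);
    close(fd);

    if (n != (ssize_t)len)
        return -1;
    out[len] = '\0';
    return 0;
}
```

Calling mmap_write_and_sync("dirty page", buf, sizeof buf) with a
64-byte buf should return 0 and leave "dirty page" in buf.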

When the page is not writable, then a page fault occurs and the generic
page fault handler is called:

arch/i386/kernel/entry.S defines the page_fault entry:

ENTRY(page_fault)
        pushl $do_page_fault
        jmp error_code

arch/i386/mm/fault.c::do_page_fault -> (We now move to non-arch specific
code.)
	mm/memory.c::handle_mm_fault ->
		handle_pte_fault

Now the code paths diverge depending on what page the fault is hitting
(shared or not, read-only or not, swap page or not, page not present in
memory at all, etc).

In the end all cases end up generating a writable page (see
pte_mkwrite), either by modifying the existing one, or by copying the
existing shared page - also known as COW = Copy On Write, or by creating
a new page if none was present at all, and doing a pte_mkdirty (or an
equivalent thereof).

The pte_mkdirty causes the page to be dirty in the page tables and this
gets caught at msync/munmap time and results in set_page_dirty being
called as described above.  (Note the pte_mkdirty results in exactly the
same thing happening to the page as when the cpu does it in the
writable-page-already-present case I described above.)
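
To make the two levels of dirty tracking concrete, here is a small
user-space toy model of the flow described above.  It is a sketch under
loud assumptions: every toy_* name is invented for illustration, the
structures are nothing like the real struct page or pte_t, and locking,
COW and the not-present cases are ignored.  It only shows the order of
events: a store dirties the pte, the msync/munmap scan turns that into a
set_page_dirty-like call that dirties the page and all of its buffers,
and writepage then finds dirty buffers to write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BUFFERS_PER_PAGE 4

struct toy_page {
    bool dirty;                               /* page-level dirty flag */
    bool has_buffers;
    bool buffer_dirty[TOY_BUFFERS_PER_PAGE];  /* per-buffer dirty flags */
};

struct toy_pte {
    struct toy_page *page;
    bool writable;
    bool dirty;                               /* the "hardware" dirty bit */
};

/* Stand-in for what __set_page_dirty_buffers does: buffers, then page. */
void toy_set_page_dirty(struct toy_page *page)
{
    if (page->has_buffers)
        for (size_t i = 0; i < TOY_BUFFERS_PER_PAGE; i++)
            page->buffer_dirty[i] = true;
    page->dirty = true;
}

/* A store through the mapping.  Returns true if it had to fault. */
bool toy_store(struct toy_pte *pte)
{
    bool faulted = !pte->writable;

    if (faulted)
        pte->writable = true;   /* fault handler: like pte_mkwrite */
    pte->dirty = true;          /* CPU, or pte_mkdirty in the handler */
    return faulted;
}

/* msync/munmap-time scan: move the pte dirty bit to the page. */
void toy_sync_pte(struct toy_pte *pte)
{
    assert(pte->page != NULL);
    if (pte->dirty) {
        pte->dirty = false;
        toy_set_page_dirty(pte->page);
    }
}

/* Stand-in for writepage: write dirty buffers, return how many. */
int toy_writepage(struct toy_page *page)
{
    int written = 0;

    if (!page->has_buffers) {   /* no buffers: treat whole page as dirty */
        page->has_buffers = true;
        for (size_t i = 0; i < TOY_BUFFERS_PER_PAGE; i++)
            page->buffer_dirty[i] = true;
    }
    for (size_t i = 0; i < TOY_BUFFERS_PER_PAGE; i++) {
        if (page->buffer_dirty[i]) {
            page->buffer_dirty[i] = false;
            written++;
        }
    }
    page->dirty = false;
    return written;
}
```

In this model the page and its buffers stay clean after toy_store()
until toy_sync_pte() runs, which is the point of the explanation above:
nothing in the filesystem hears about the write until the pte is scanned.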

Best regards,

        Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/



* Re: Writing out a (file) mmapped page
From: Nikita Danilov @ 2005-06-14 16:37 UTC
  To: Anton Altaparmakov; +Cc: linux-fsdevel

Anton Altaparmakov writes:

[...]

 > 
 > That is a little trickier than one might expect because it is partially
 > done in hardware (heavily arch dependent though).
 > 
 > When a write to a writable mmapped page happens the CPU sets the page
 > dirty flag in hardware.  So there is no code where you can see this
 > happen.  Later on, at msync() time or munmap() time, this hardware dirty
 > bit results in a set_page_dirty() being called which for buffer based

This also happens when a page migrates to the cold end of the "VM LRU
list" (the tail of the inactive list, technically).
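
As a rough illustration of that reclaim case, here is a tiny user-space
toy, with all names (toy_lru_page, toy_shrink_tail) invented for the
sketch and no claim to match the real reclaim code.  It assumes only
what is said above: when the page at the tail of the inactive list is
examined, a dirty pte is turned into a dirty page, and a dirty page must
be written back before it can be freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_lru_page {
    bool pte_dirty;     /* dirty bit still sitting in the page table */
    bool page_dirty;    /* page-level dirty flag */
};

enum toy_outcome {
    TOY_FREED_CLEAN,    /* page was clean and could be dropped */
    TOY_WRITTEN_BACK    /* page was dirty and had to be written first */
};

/* Examine the page at the tail (last element) of a toy inactive list. */
enum toy_outcome toy_shrink_tail(struct toy_lru_page *inactive, size_t count)
{
    struct toy_lru_page *page;

    assert(inactive != NULL && count > 0);
    page = &inactive[count - 1];

    /* Unmapping: carry the pte dirty bit over to the page. */
    if (page->pte_dirty) {
        page->pte_dirty = false;
        page->page_dirty = true;
    }

    /* A dirty page goes through writepage before it can be reclaimed. */
    if (page->page_dirty) {
        page->page_dirty = false;
        return TOY_WRITTEN_BACK;
    }
    return TOY_FREED_CLEAN;
}
```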

Nikita.

