From: Michael Clark <michael@metaparadigm.com>
To: Lutz Vieweg <lkv@isg.de>
Cc: Robin Holt <holt@sgi.com>, linux-kernel@vger.kernel.org
Subject: Re: How to find out which pages were copied-on-write?
Date: Sat, 10 Jul 2004 16:11:52 +0800
Message-ID: <40EFA4C8.1050409@metaparadigm.com>
In-Reply-To: <40EF0346.4040407@isg.de>

HPA's LPSM library sounds like what you're looking for.

http://freshmeat.net/projects/lpsm/

Or you can do what you want the hard way: write-protect the mapping with
mprotect() and note which pages fault in a SIGSEGV handler.
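
A rough sketch of the mprotect route, in case it helps. It assumes a
single-threaded process and one tracked region; the names track_begin,
track_commit and on_segv are made up for illustration and are not from
LPSM or any other library. It also relies on mprotect() being callable
from a signal handler, which works on Linux in practice but is not
something POSIX promises.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical tracker state: one region, one dirty flag per page. */
static char *region_base;
static size_t region_len;
static size_t region_pages;
static size_t page_size;
static unsigned char *dirty;

/* Write fault inside the region: mark the page dirty and make it writable.
 * The faulting store is restarted when the handler returns. */
static void on_segv(int sig, siginfo_t *si, void *ctx)
{
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t base = (uintptr_t)region_base;
    size_t idx;

    (void)ctx;
    if (addr < base || addr >= base + region_pages * page_size) {
        /* Not ours: restore the default action so the retried access
         * kills the process as usual. */
        signal(sig, SIG_DFL);
        return;
    }
    idx = (addr - base) / page_size;
    dirty[idx] = 1;
    mprotect(region_base + idx * page_size, page_size,
             PROT_READ | PROT_WRITE);
}

/* Start tracking: install the handler and write-protect the region.
 * base must be page-aligned, e.g. the address returned by mmap(). */
static int track_begin(void *base, size_t len)
{
    struct sigaction sa;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region_base = base;
    region_len = len;
    region_pages = (len + page_size - 1) / page_size;
    dirty = calloc(region_pages, 1);
    if (dirty == NULL)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return mprotect(base, region_pages * page_size, PROT_READ);
}

/* At a consistent point: write every dirty page to fd at its file offset,
 * clear the flags and write-protect the region again.
 * Returns the number of pages written, or -1 on error. */
static long track_commit(int fd)
{
    long written = 0;
    size_t i;

    for (i = 0; i < region_pages; i++) {
        size_t off = i * page_size;
        size_t n = region_len - off < page_size ? region_len - off : page_size;

        if (!dirty[i])
            continue;
        if (pwrite(fd, region_base + off, n, (off_t)off) != (ssize_t)n)
            return -1;
        dirty[i] = 0;
        written++;
    }
    if (mprotect(region_base, region_pages * page_size, PROT_READ) != 0)
        return -1;
    return written;
}
```

You would call track_begin() on the MAP_PRIVATE mapping before handing it
to the library, and track_commit() with the file's descriptor once the
data is consistent. The cost is one fault per page per commit interval,
and the write-out itself is still not atomic.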

~mc

On 07/10/04 04:42, Lutz Vieweg wrote:
> Robin Holt wrote:
> 
>> OK, now that I am considering this problem,  I am trying to figure out
>> what problem we are trying to solve.
>>
>> By reading your email, I gather that you have a single threaded
>> application which is doing an mmap on a file as a MAP_PRIVATE mapping.
>> The memory area is then handed to a library which may modify some pages.
>> You want to decide after the return if you had success and thereby
>> control the writing of the updated data back to the file.  Because of
>> the size of the file, doing a second mapping and comparing/copying pages
>> is unreasonable and you would like to only modify the pages that have
>> actually changed.
> 
> 
> That's about it, the most important issue is that I want to avoid
> having an inconsistent file on the disk for long periods, because a)
> the application could crash and b) another process might want to map
> the same file (read-only). And since the application is reaching points
> where the data is consistent, while it is not in between, it would be nice
> to have a private mapping while it is inconsistent and commit the changes
> only at the points where the application knows the data is consistent.
> 
> Turning a private into a shared mapping would be a perfect solution
> since that would mean another process could map the file at any time
> and find consistent data. The second best solution would be the
> one where the application just manually writes out the changed pages
> at the time of consistency; this would at least reduce the times when the
> data on disk is inconsistent to a minimum.
> 
> 
> 
>>> Yet another feature that I could use if it were available:
>>> A "copy-on-read"-mapping. There, a page would become a private
>>> copy of a process once _another_ process wrote data to the
>>> corresponding file location. But I suspect that feature
>>> could be very hard to implement...
>>
>>
>> This is a different way of thinking of copy-on-write.  I believe you
>> are thinking of the time when there are two processes sharing the page.
>> When one process takes the write fault, the page is copied by that
>> process and the other process becomes the exclusive owner of the page.
> 
> 
> A little different: Think of N processes (N may be 8 or so) that mmap()
> a file using a new mode "MAP_SNAPSHOT" (which could be read-only if a mix
> with private copy-on-write pages was too hard to realize), and 1 process
> mmap()ing the same file using MAP_SHARED. Once the N processes mmap()ed
> the file using MAP_SNAPSHOT, their "view" of the file content would never
> change, that is, if the one process that mmap()ed the file with MAP_SHARED
> writes to a page, that page _is_ written to disk the usual way, but the
> other N processes get a copy of the page before it has been changed, so
> they will always see the same data.
> Once the processes that mmap()ed using MAP_SNAPSHOT unmap the file, the
> copies of the pages that were changed on disk are simply discarded.
> 
> That would - similar to the features mentioned above - allow one process
> to efficiently work on portions of a huge file over a longer period of
> time, and only at times when the file in total contains consistent data,
> other processes could be instructed to mmap() it again to obtain a newer
> version.
> 
> 
> Regards,
> 
> Lutz Vieweg

-- 
Michael Clark,  . . . . . . . . . . . .  michael@metaparadigm.com
Metaparadigm Pte. Ltd . . . . . . . . http://www.metaparadigm.com

                    "Explore Operations Research:
          The Science of Better at www.scienceofbetter.org "
