public inbox for linux-kernel@vger.kernel.org
From: Lutz Vieweg <lkv@isg.de>
To: Robin Holt <holt@sgi.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: How to find out which pages were copied-on-write?
Date: Fri, 09 Jul 2004 22:42:46 +0200
Message-ID: <40EF0346.4040407@isg.de>
In-Reply-To: <20040709113125.GA8897@lnx-holt.americas.sgi.com>

Robin Holt wrote:
> OK, now that I am considering this problem,  I am trying to figure out
> what problem we are trying to solve.
> 
> By reading your email, I gather that you have a single threaded
> application which is doing an mmap on a file as a MAP_PRIVATE mapping.
> The memory area is then handed to a library which may modify some pages.
> You want to decide after the return if you had success and thereby
> control the writing of the updated data back to the file.  Because of
> the size of the file, doing a second mapping and comparing/copying pages
> is unreasonable and you would like to only modify the pages that have
> actually changed.

That's about it. The most important issue is that I want to avoid
having an inconsistent file on disk for long periods, because a)
the application could crash and b) another process might want to map
the same file (read-only). The application reaches points where the
data is consistent, but in between it is not, so it would be nice to
have a private mapping while the data is inconsistent and to commit
the changes only at the points where the application knows the data
is consistent.

Turning a private mapping into a shared one would be a perfect solution,
since another process could then map the file at any time
and find consistent data. The second-best solution would be for
the application to write out the changed pages manually
at the time of consistency; this would at least reduce the time during
which the data on disk is inconsistent to a minimum.
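
To illustrate the second-best approach, here is a rough user-space sketch.
It is only one possible workaround, not a kernel feature: the private
mapping starts out read-only, a SIGSEGV handler records which page took
the first write and then makes it writable, and at a consistent point only
the recorded pages are written back with pwrite(). All cow_track_* names
and the globals are hypothetical, it assumes a single-threaded process and
a file descriptor opened read-write, and calling mprotect() from a signal
handler works on Linux but is not guaranteed by POSIX.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one tracked MAP_PRIVATE mapping. */
static char *g_base;                    /* start of the mapping */
static size_t g_file_len;               /* length of the mapped file data */
static size_t g_map_len;                /* g_file_len rounded up to whole pages */
static size_t g_page;                   /* system page size */
static volatile unsigned char *g_dirty; /* 1 = page written since last commit */

/* First write to a read-only page lands here: note the page, allow writes. */
static void on_segv(int sig, siginfo_t *si, void *ctx)
{
    char *addr = (char *)si->si_addr;
    (void)ctx;
    if (addr < g_base || addr >= g_base + g_map_len) {
        signal(sig, SIG_DFL);           /* a real crash: let it re-fault */
        return;
    }
    size_t idx = (size_t)(addr - g_base) / g_page;
    g_dirty[idx] = 1;
    mprotect(g_base + idx * g_page, g_page, PROT_READ | PROT_WRITE);
}

/* Map the first len bytes of fd privately and start tracking writes. */
char *cow_track_map(int fd, size_t len)
{
    struct sigaction sa;

    g_page = (size_t)sysconf(_SC_PAGESIZE);
    g_file_len = len;
    g_map_len = (len + g_page - 1) / g_page * g_page;
    g_dirty = calloc(g_map_len / g_page, 1);
    if (g_dirty == NULL)
        return NULL;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;

    g_base = mmap(NULL, g_map_len, PROT_READ, MAP_PRIVATE, fd, 0);
    return g_base == MAP_FAILED ? NULL : g_base;
}

/* Write every modified page back to fd; returns pages written, or -1. */
long cow_track_commit(int fd)
{
    long written = 0;

    for (size_t i = 0; i < g_map_len / g_page; i++) {
        if (!g_dirty[i])
            continue;
        size_t off = i * g_page;
        size_t n = g_file_len - off < g_page ? g_file_len - off : g_page;
        assert(off < g_file_len);
        if (pwrite(fd, g_base + off, n, (off_t)off) != (ssize_t)n)
            return -1;
        g_dirty[i] = 0;
        mprotect(g_base + off, g_page, PROT_READ); /* track the next write again */
        written++;
    }
    return written;
}
```

The library would be handed the pointer returned by cow_track_map(), and
cow_track_commit() would be called only at a consistent point. The file is
still briefly inconsistent while the commit loop runs, so this narrows the
window rather than closing it.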



>>Yet another feature that I could use if it were available:
>>A "copy-on-read"-mapping. There, a page would become a private
>>copy of a process once _another_ process wrote data to the
>>corresponding file location. But I suspect that feature
>>could be very hard to implement...
> 
> This is a different way of thinking of copy-on-write.  I believe you
> are thinking of the time when there are two processes sharing the page.
> When one process takes the write fault, the page is copied by that
> process and the other process becomes the exclusive owner of the page.

A little different: think of N processes (N may be 8 or so) that mmap()
a file using a new mode "MAP_SNAPSHOT" (which could be read-only if mixing
it with private copy-on-write pages were too hard to realize), and 1 process
mmap()ing the same file using MAP_SHARED. Once the N processes have mmap()ed
the file using MAP_SNAPSHOT, their "view" of the file content would never
change. That is, if the one process that mmap()ed the file with MAP_SHARED
writes to a page, that page _is_ written to disk the usual way, but the
other N processes get a copy of the page as it was before the change, so
they will always see the same data.
Once the processes that mmap()ed using MAP_SNAPSHOT unmap the file, the
copies of the pages that were changed on disk are simply discarded.
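
MAP_SNAPSHOT does not exist, so the following is only a crude user-space
approximation of the semantics, not the proposed feature. The reader maps
the file MAP_PRIVATE and writes each page's first byte back to itself, which
forces the kernel to give it a private copy of every page up front. Later
writes to the file then no longer show through. It copies the whole file
instead of only the pages that change, which is exactly the cost the
proposed mode would avoid, and a writer racing with the touch loop could
still be observed. snapshot_map() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of fd and detach every page from the file immediately. */
char *snapshot_map(int fd, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

    if (p == MAP_FAILED)
        return NULL;

    for (size_t off = 0; off < len; off += page) {
        volatile char *q = p + off;
        *q = *q;        /* write fault: this page becomes a private copy */
    }

    /* The snapshot is meant to be read-only from here on. */
    if (mprotect(p, len, PROT_READ) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

Unmapping the region with munmap() discards the private copies, which
matches the "simply discarded" behaviour described above.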

That would - similar to the features mentioned above - allow one process
to work efficiently on portions of a huge file over a longer period of time.
Only at times when the file as a whole contains consistent data would the
other processes be instructed to mmap() it again to obtain a newer version.


Regards,

Lutz Vieweg


