All of lore.kernel.org
 help / color / mirror / Atom feed
From: chrisl@vmware.com
To: Andrew Morton <akpm@digeo.com>
Cc: andrea@suse.de, linux-kernel@vger.kernel.org,
	chrisl@gnuchina.org, Alan Cox <alan@lxorguk.ukuu.org.uk>
Subject: Re: writepage return value check in vmscan.c
Date: Thu, 24 Oct 2002 10:57:18 -0700	[thread overview]
Message-ID: <20021024175718.GA1398@vmware.com> (raw)
In-Reply-To: <3DB7B11B.9E552CFF@digeo.com>

On Thu, Oct 24, 2002 at 01:36:43AM -0700, Andrew Morton wrote:
> 
> Yup.  If the kernel cannot write back your MAP_SHARED page due to
> ENOSPC it throws your data away.
> 
> The alternative would be to allow you to pin an arbitrary amount of
> unpageable memory.

I know the error handling in mmaped memory is poor. But I am not talking
about that one. There are two place the mmaped memory can flush back
to disk. The one you are talking about is filemap_fdatasync() in filemap.c.
It will be called when try to unmmap or sync back to the disk. It at least
check the err when writepage fail. But it still clear the page dirty anyway,
looks bad to me.

The one I am complaining about is in vmscan.c, when kswapd try to
shink the cache. Correct me if I am wrong. kswapd will flush some mmaped
page back to disk to release some page from page cache. Even when
write page fail. Kernel will still do:

ClearPageDirty(page);
SetPageLaunder(page);

for that page. So this page will be able to used by other process.
When it have a missing page fault, it will read back the WRONG
data from the disk and cause a memory corruption.

So I am expecting something like tihs:

			if ((gfp_mask & __GFP_FS) && writepage) {
+                               unsigned long flags = page->flags;

				ClearPageDirty(page);
				SetPageLaunder(page);
				page_cache_get(page);
				spin_unlock(&pagemap_lru_lock);

-				writepage(page);
+				if (writepage(page))
+					page->flags = flags;

				page_cache_release(page);

				spin_lock(&pagemap_lru_lock);
				continue;
                        }

> 
> A few fixes have been discussed.  One way would be to allocate
> the space for the page when it is first faulted into reality and
> deliver SIGBUS if backing store for it could not be allocated.

I am not sure how the user program handle that signal...

>  
> Ayup.  MAP_SHARED is a crock.  If you want to write to a file, use write().
> 
> View MAP_SHARED as a tool by which separate processes can attach
> to some shared memory which is identified by the filesystem namespace.
> It's not a very good way of performing I/O.

That is exactly the case for vmware ram file. VMware only use it to share
memory. Those are the virtual machine's memory. We don't want to write
it back to disk and we don't care what is left on the file system because
when vmware exit, we will throw the guest ram data away just like a real
machine power off ram will lost. We are not talking about machine using
flash ram :-). 

It is kswapd try to flush the data and it should take response to handle
the error. If it fail, one thing it should do is keep the page dirty
if write back fail. At least not corrupt memory like that.

If we can deliver the error to user program that would be a plus.
But this need to be fix frist.

Chris



  parent reply	other threads:[~2002-10-24 17:50 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-10-24  8:25 writepage return value check in vmscan.c chrisl
2002-10-24  8:36 ` Andrew Morton
2002-10-24  9:15   ` Alan Cox
2002-10-24 11:44     ` Andrea Arcangeli
2002-10-24 16:12       ` Andrew Morton
2002-10-24 17:59     ` chrisl
2002-10-24 11:31   ` Andrea Arcangeli
2002-10-24 18:30     ` chrisl
2002-10-24 18:40       ` Andrea Arcangeli
2002-10-24 19:14         ` Rik van Riel
2002-10-24 19:25           ` Andrew Morton
2002-10-24 17:57   ` chrisl [this message]
2002-10-24 18:33     ` Andrea Arcangeli
2002-10-24 19:15       ` chrisl
2002-10-24 20:41         ` Andrea Arcangeli
2002-10-24 21:17           ` chrisl
2002-10-24 20:46         ` Andrew Morton
2002-10-24 21:23           ` chrisl
2002-10-24 21:29             ` Andrew Morton
2002-10-25 16:11               ` Paul Larson
2002-10-25 16:31                 ` Christoph Hellwig
2002-10-25 17:07                 ` Rik van Riel
2002-10-25 18:44         ` Andrew Morton
2002-10-28 19:17           ` chrisl
2002-10-28 19:53             ` Andrew Morton
2002-10-28 20:38               ` chrisl
2002-10-28 21:14               ` Andrea Arcangeli
2002-10-28  8:28         ` Christoph Rohland
2002-10-28 18:44           ` chrisl
2002-10-28 19:22             ` Andrea Arcangeli
2002-10-28 19:29               ` chrisl
2002-10-29  6:10               ` Randy.Dunlap
2002-10-29  7:08                 ` Andreas Dilger
2002-10-28 19:58       ` chrisl
2002-10-28 21:32         ` Andrea Arcangeli
2002-10-30  4:13           ` chrisl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20021024175718.GA1398@vmware.com \
    --to=chrisl@vmware.com \
    --cc=akpm@digeo.com \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=andrea@suse.de \
    --cc=chrisl@gnuchina.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.