From: Andrew Morton <akpm@osdl.org>
To: Andrey Savochkin <saw@saw.sw.com.sg>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Q about pagecache data never written to disk
Date: Sun, 5 Sep 2004 14:00:40 -0700
Message-ID: <20040905140040.58a5fcdc.akpm@osdl.org>
In-Reply-To: <20040905154336.B9202@castle.nmd.msu.ru>

Andrey Savochkin <saw@saw.sw.com.sg> wrote:
>
> Hi Andrew,
> 
> On Sun, Sep 05, 2004 at 03:52:33AM -0700, Andrew Morton wrote:
> > Andrey Savochkin <saw@saw.sw.com.sg> wrote:
> > >
> > > Let's suppose an mmap'ed (SHARED, RW) file has a hole.
> > >  AFAICS, we allow to dirty the file pages without allocating the space for the
> > >  hole - filemap_nopage just "reads" the page filling it with zeroes, and
> > >  nothing is done about the on-disk data until writepage.
> > > 
> > >  So, if the page can't be written to disk (no space), the dirty data just
> > >  stays in the pagecache.  The data can be read or seen via mmap, but it isn't
> > >  and never will be on disk.  The pagecache stays unsynchronized with the on-disk
> > >  content forever.
> > 
> > The kernel will make one attempt to write the data to disk.  If that write
> > hits ENOSPC, the page is not redirtied (ie: the data can be lost).
> > 
> > When that write hits ENOSPC an error flag is set in the address_space and
> > that will be returned from a subsequent msync().  The application will then
> > need to do something about it.
> > 
> > If your application doesn't msync() the memory then it doesn't care about
> > its data anyway.  If your application _does_ msync the pages then we
> > reliably report errors.
> 
> This question came to my mind when I was thinking about journal_start in
> ext3_prepare_write and copy_from_user issue...
> Did you follow that discussion?

Yup.  Chris and I have been admiring the problem for a few months now.

> In the considered scenario not only the application is not
> guaranteed anything till msync(), but all other programs doing regular read()
> may also be fooled about the file content, and this idea surprised me.
> On the other hand, after a write() other programs also see the new content
> without a guarantee that this content corresponds with what is on the disk...

No, read() will see the modified pagecache data immediately, apart from CPU
cache coherency effects.
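
[Editorial illustration, not part of the original message.] The behaviour
discussed above can be observed from userspace. The following is a minimal
sketch, assuming a Unix system and a filesystem that supports sparse files;
the helper name dirty_hole_via_mmap and the file name sparse.bin are
hypothetical. It dirties a page in a file hole through a MAP_SHARED mapping,
confirms that a plain read() already sees the data, and then calls msync()
(via Python's mmap.flush()), which is where a writeback error such as ENOSPC
would be expected to show up. Exact error-reporting behaviour may differ
between kernel versions.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def dirty_hole_via_mmap(path, payload=b"dirty"):
    """Write `payload` into a file hole through a MAP_SHARED mapping.

    Returns (bytes seen by a plain read, OSError from msync or None).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Extending the file without writing leaves a hole on filesystems
        # that support sparse files (assumption: no blocks allocated yet).
        os.ftruncate(fd, PAGE)
        mm = mmap.mmap(fd, PAGE, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE)
        try:
            # Storing through the mapping dirties the pagecache page.
            mm[:len(payload)] = payload
            # A regular read is served from the same pagecache page,
            # before anything has been written back to disk.
            seen = os.pread(fd, len(payload), 0)
            # flush() calls msync(); a failed writeback would be raised
            # here as OSError rather than being silently dropped.
            try:
                mm.flush()
                err = None
            except OSError as exc:
                err = exc
        finally:
            mm.close()
    finally:
        os.close(fd)
    return seen, err


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        seen, err = dirty_hole_via_mmap(os.path.join(tmp, "sparse.bin"))
        print("read() saw:", seen)
        print("msync error:", err)
```

An application that skips the flush() step has no point at which the kernel
can report the failure to it, which is the situation described in the quoted
text about msync().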

> > 
> > >  Is it the intended behavior?
> > >  Shouldn't we call the filesystem to fill the hole at the moment of the first
> > >  write access?
> > 
> > That would be a retrograde step - it would be nice to move in the other
> > direction: perform disk allocation at writeback time rather than at write()
> > time, even for regular write() data.  To do that we (probably) need space
> > reservation APIs.  And yes, we perhaps could reserve space in the
> > filesystem when that page is first written to.
> > 
> > But then what would we do if there's no space?  SIGBUS?  SIGSEGV? 
> > Inappropriate.  SIGENOSPC?
> 
> Should the space be allocated on close()?

What effect are you trying to achieve?

> Who will get the signal if nobody accesses the file anymore?

Nobody.  That's the point.  Plus there _is_ no signal defined for this. 
Neither in Linux nor in POSIX.

> I'm also thinking about various shell scripts with redirects to files...

?  I doubt that they're writing files via MAP_SHARED.
