Re: [PATCH 02/11] vfs: Add better VFS support for page_mkwrite when blocksize < pagesize


From: Nick Piggin <npiggin@suse.de>
To: Jan Kara <jack@suse.cz>, OM@suse.de
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 02/11] vfs: Add better VFS support for page_mkwrite when blocksize < pagesize
Date: Mon, 29 Jun 2009 07:54:21 +0200
Message-ID: <20090629055421.GG11450@wotan.suse.de>
In-Reply-To: <20090626160851.GA22335@duck.suse.cz>

On Fri, Jun 26, 2009 at 06:08:51PM +0200, Jan Kara wrote:
> On Fri 26-06-09 14:55:05, Nick Piggin wrote:
> > On Fri, Jun 26, 2009 at 02:21:41PM +0200, Jan Kara wrote:
> > >   So if you have any idea how to better solve this, you are welcome ;).
> > 
> > Ah thanks, the write(2) case I missed. That does get complex to
> > do with the page lock.
> > 
> > I agree with the semantics you are aiming for, and I agree we should
> > not try to allocate blocks when extending i_size.
> > 
> > We actually could update i_size after dropping the page lock in
> > these paths. That would give a window where we can page_mkclean
> > the old partial page before the i_size update.
>   Yes, that would be fine and make things simpler...

Hopefully.

 
> > However this does actually require that we remove the partial-page
> > zeroing that writepage does. I think it does it in order to attempt
> > to write zeroes into the fs even if the app does mmaped writes
> > past i_size... but it is pretty dumb anyway really because the
> > behaviour is undefined anyway so there is no problem if weird
> > stuff gets written there (it should be zeroed out when extending
> > the file anyway), and also there is nothing to prevent races of
> > subsequent mmapped writes before the DMA completes.
>   We definitely don't zero out the last page when extending the file. But
> if we do it, we should be fine as you write. I'll try to write a patch...
> (I'm on vacation next week though so probably after that).

What I mean is that as of today, write(2) is required to hold the page
lock of the page it is operating on if it writes anything past i_size. It
must hold that lock until i_size is extended to include the new data.
If it does not hold the lock, then e.g. block_write_full_page can zero
out that data incorrectly:

        /*
         * The page straddles i_size.  It must be zeroed out on each and every
         * writepage invokation because it may be mmapped.  "A file is mapped
         * in multiples of the page size.  For a file that is not a multiple of
         * the  page size, the remaining memory is zeroed when mapped, and
         * writes to that region are not written out to the file."
         */

But I argue this zeroing is bogus because it is completely racy, and
writing past i_size through a mapping should be undefined behaviour
anyway. So I think it would be fine to remove it.
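
To make the window concrete, here is a rough user-space model of the
ordering problem. It is only a sketch and not kernel code: struct
model_file, model_writepage(), model_copy_in(), model_extend_size() and
model_run() are all made-up names for illustration, and a single
4096-byte buffer stands in for the page that straddles i_size. It assumes
writepage zeroes everything past i_size, as the comment above describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for an inode plus the page straddling i_size. */
struct model_file {
	size_t i_size;                        /* current file size in bytes */
	unsigned char page[MODEL_PAGE_SIZE];  /* last page of the file */
};

/* Models the tail zeroing done at writepage time: clear bytes past i_size. */
static void model_writepage(struct model_file *f)
{
	if (f->i_size < MODEL_PAGE_SIZE)
		memset(f->page + f->i_size, 0, MODEL_PAGE_SIZE - f->i_size);
}

/* First half of an extending write(2): copy the data into the page. */
static void model_copy_in(struct model_file *f, size_t pos,
			  const void *buf, size_t len)
{
	assert(pos + len <= MODEL_PAGE_SIZE);
	memcpy(f->page + pos, buf, len);
}

/* Second half of an extending write(2): make i_size cover the new data. */
static void model_extend_size(struct model_file *f, size_t new_size)
{
	if (new_size > f->i_size)
		f->i_size = new_size;
}

/*
 * Runs one extending write at offset 100 of a 100-byte file.
 * If writepage_in_window is nonzero, writepage runs between the copy and
 * the i_size update, which is what dropping the page lock early would
 * allow. Otherwise writepage only runs after i_size has been extended.
 * Returns 1 if the newly written bytes survived, 0 if they were zeroed.
 */
static int model_run(int writepage_in_window)
{
	static const char data[] = "new data";
	const size_t len = sizeof(data) - 1;
	struct model_file f;

	memset(&f, 0, sizeof(f));
	f.i_size = 100;

	model_copy_in(&f, 100, data, len);
	if (writepage_in_window)
		model_writepage(&f);   /* sees old i_size, wipes the new data */
	model_extend_size(&f, 100 + len);
	if (!writepage_in_window)
		model_writepage(&f);   /* sees new i_size, data is kept */

	return memcmp(f.page + 100, data, len) == 0;
}
```

In this model, model_run(0) keeps the data and model_run(1) loses it,
which is the case the page lock currently protects against. Removing the
zeroing from writepage would make both orderings keep the data.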


