From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ying Han <yinghan@google.com>, Jan Kara <jack@suse.cz>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	guichaz@gmail.com, Alex Khesin <alexk@google.com>,
	Mike Waychison <mikew@google.com>,
	Rohit Seth <rohitseth@google.com>
Subject: Re: ftruncate-mmap: pages are lost after writing to mmaped file.
Date: Thu, 19 Mar 2009 17:16:01 +0100
Message-ID: <1237479361.24626.23.camel@twins>
In-Reply-To: <200903200248.22623.nickpiggin@yahoo.com.au>

On Fri, 2009-03-20 at 02:48 +1100, Nick Piggin wrote:
> On Thursday 19 March 2009 10:54:33 Ying Han wrote:
> > On Wed, Mar 18, 2009 at 4:36 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > > On Wed, 18 Mar 2009, Ying Han wrote:
> > >> > Can you say what filesystem, and what mount-flags you use? Iirc, last
> > >> > time we had MAP_SHARED lost writes it was at least partly triggered by
> > >> > the filesystem doing its own flushing independently of the VM (ie ext3
> > >> > with "data=journal", I think), so that kind of thing does tend to
> > >> > matter.
> > >>
> > >> /etc/fstab
> > >> "/dev/hda1 / ext2 defaults 1 0"
> > >
> > > Sadly, /etc/fstab is not necessarily accurate for the root filesystem. At
> > > least Fedora will ignore the flags in it.
> > >
> > > What does /proc/mounts say? That should be a more reliable indication of
> > > what the kernel actually does.
> >
> > "/dev/root / ext2 rw,errors=continue 0 0"
> 
> No luck with finding the problem yet.
> 
> But I think we do have a race in __set_page_dirty_buffers():
> 
> The page may not have buffers between the mapping->private_lock
> critical section and the __set_page_dirty call there. So between
> them, another thread might do a create_empty_buffers which can
> see !PageDirty and thus it will create clean buffers. The page
> will get dirtied by the original thread, but if the buffers are
> clean it can be cleaned without writing out buffers.
> 
> Holding mapping->private_lock over the __set_page_dirty should
> fix it, although I guess you'd want to release it before calling
> __mark_inode_dirty so as not to put inode_lock under there. I
> have a patch for this if it sounds reasonable.

When I first did those dirty tracking patches someone (I think Andrew)
commented on the fact that I did set_page_dirty() under one of these
inner locks..

/me frobs around in archives for a bit..

 - fs/buffer.c try_to_free_buffers(): remove clear_page_dirty() from under
   ->private_lock. This seems to be safe, since ->private_lock is used to
   serialize access to the buffers, not the page itself.

Hmm, that's a slightly different issue...

But yeah, your scenario makes heaps of sense.

Can't we do the TestSetPageDirty() before private_lock ? It's currently
done before tree_lock as well.
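
To make the interleaving concrete, here is a toy user-space model of the
scenario quoted above. It is only a sketch, not kernel code: struct
toy_page, write_lost(), check_model() and the *_model helpers are
invented for illustration, and locking is reduced to the order in which
the steps run. It assumes that new buffers inherit the page's dirty state
and that writeback of a page with buffers writes only dirty buffers.
Within that model, the current order (buffers first, page flag second)
loses the write when create_empty_buffers() lands between the two steps,
while setting the page flag first does not. Whether that reordering is
safe in the real kernel is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page and its buffer_heads (hypothetical, not kernel code). */
struct toy_page {
	bool dirty;		/* models PageDirty */
	bool has_buffers;	/* models page_has_buffers() */
	bool buffers_dirty;	/* models buffer_dirty() on all buffers */
};

/* Models the ->private_lock section of __set_page_dirty_buffers(). */
static void dirty_existing_buffers(struct toy_page *p)
{
	if (p->has_buffers)
		p->buffers_dirty = true;
}

/* Models TestSetPageDirty(): set the flag, return the old value. */
static bool test_set_page_dirty(struct toy_page *p)
{
	bool was_dirty = p->dirty;

	p->dirty = true;
	return was_dirty;
}

/* Models create_empty_buffers(): new buffers inherit the page's dirty state. */
static void create_empty_buffers_model(struct toy_page *p)
{
	if (!p->has_buffers) {
		p->has_buffers = true;
		p->buffers_dirty = p->dirty;
	}
}

/* Models writeback: with buffers, only dirty buffers are written. */
static bool writeback_model(struct toy_page *p)
{
	bool wrote = p->has_buffers ? p->buffers_dirty : p->dirty;

	p->dirty = false;
	p->buffers_dirty = false;
	return wrote;
}

/*
 * Thread A dirties the page in two steps; thread B creates buffers at
 * position b_at (0 = before A, 1 = between A's steps, 2 = after A).
 * flag_first selects the order of A's steps. Returns true if the
 * write is lost, i.e. writeback cleans the page without writing.
 */
bool write_lost(bool flag_first, int b_at)
{
	struct toy_page p = { false, false, false };

	if (b_at == 0)
		create_empty_buffers_model(&p);
	if (flag_first)
		test_set_page_dirty(&p);
	else
		dirty_existing_buffers(&p);
	if (b_at == 1)
		create_empty_buffers_model(&p);
	if (flag_first)
		dirty_existing_buffers(&p);
	else
		test_set_page_dirty(&p);
	if (b_at == 2)
		create_empty_buffers_model(&p);

	return !writeback_model(&p);
}

/* Checks every interleaving of the model for both orderings. */
void check_model(void)
{
	assert(!write_lost(false, 0));
	assert(write_lost(false, 1));	/* the race window described above */
	assert(!write_lost(false, 2));
	for (int b_at = 0; b_at <= 2; b_at++)
		assert(!write_lost(true, b_at));
}
```

Calling check_model() from a small main() exercises all six
interleavings. The model says nothing about tree_lock, inode_lock or
concurrent cleaning of the page, so it cannot stand in for review of
the actual patch.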




