linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] mm: Improve comment before pagecache_isize_extended()
@ 2014-11-04 11:43 Jan Kara
  2014-11-04 12:20 ` Jan Beulich
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2014-11-04 11:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-fsdevel, Jan Beulich, Dave Chinner, Jan Kara

Not all filesystems use i_mutex for serialization; reflect that in the
comment. Also expand the reasoning a bit, since it is complex enough to
deserve more detail.

Reported-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/truncate.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

BTW, Dave has queued a patch which removes the
WARN_ON(!mutex_is_locked(&inode->i_mutex)) from the function. That should
go to Linus ASAP.

diff --git a/mm/truncate.c b/mm/truncate.c
index 261eaf6e5a19..b248c0c8dcd1 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -743,10 +743,13 @@ EXPORT_SYMBOL(truncate_setsize);
  * changed.
  *
  * The function must be called after i_size is updated so that page fault
- * coming after we unlock the page will already see the new i_size.
- * The function must be called while we still hold i_mutex - this not only
- * makes sure i_size is stable but also that userspace cannot observe new
- * i_size value before we are prepared to store mmap writes at new inode size.
+ * coming after we unlock the page will already see the new i_size.  The caller
+ * must make sure (generally by holding i_mutex but e.g. XFS uses its private
+ * lock) i_size cannot change from the new value while we are called. It must
+ * also make sure userspace cannot observe new i_size value before we are
+ * prepared to store mmap writes upto new inode size (otherwise userspace could
+ * think it stored data via mmap within i_size but they would get zeroed due to
+ * writeback & reclaim because they have no backing blocks).
  */
 void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
 {
-- 
1.8.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] mm: Improve comment before pagecache_isize_extended()
  2014-11-04 11:43 [PATCH] mm: Improve comment before pagecache_isize_extended() Jan Kara
@ 2014-11-04 12:20 ` Jan Beulich
  2014-11-04 15:33   ` Jan Kara
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Beulich @ 2014-11-04 12:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Dave Chinner, linux-mm, Andrew Morton, linux-fsdevel

>>> On 04.11.14 at 12:43, <jack@suse.cz> wrote:
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -743,10 +743,13 @@ EXPORT_SYMBOL(truncate_setsize);
>   * changed.
>   *
>   * The function must be called after i_size is updated so that page fault
> - * coming after we unlock the page will already see the new i_size.
> - * The function must be called while we still hold i_mutex - this not only
> - * makes sure i_size is stable but also that userspace cannot observe new
> - * i_size value before we are prepared to store mmap writes at new inode size.
> + * coming after we unlock the page will already see the new i_size.  The caller
> + * must make sure (generally by holding i_mutex but e.g. XFS uses its private
> + * lock) i_size cannot change from the new value while we are called. It must
> + * also make sure userspace cannot observe new i_size value before we are
> + * prepared to store mmap writes upto new inode size (otherwise userspace could
> + * think it stored data via mmap within i_size but they would get zeroed due to
> + * writeback & reclaim because they have no backing blocks).
>   */
>  void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
>  {

May I suggest that the comment preceding truncate_setsize() also be
updated/removed?

Jan


* Re: [PATCH] mm: Improve comment before pagecache_isize_extended()
  2014-11-04 12:20 ` Jan Beulich
@ 2014-11-04 15:33   ` Jan Kara
  2014-11-04 16:20     ` Jan Beulich
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2014-11-04 15:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jan Kara, Dave Chinner, linux-mm, Andrew Morton, linux-fsdevel

On Tue 04-11-14 12:20:26, Jan Beulich wrote:
> >>> On 04.11.14 at 12:43, <jack@suse.cz> wrote:
> > --- a/mm/truncate.c
> > +++ b/mm/truncate.c
> > @@ -743,10 +743,13 @@ EXPORT_SYMBOL(truncate_setsize);
> >   * changed.
> >   *
> >   * The function must be called after i_size is updated so that page fault
> > - * coming after we unlock the page will already see the new i_size.
> > - * The function must be called while we still hold i_mutex - this not only
> > - * makes sure i_size is stable but also that userspace cannot observe new
> > - * i_size value before we are prepared to store mmap writes at new inode size.
> > + * coming after we unlock the page will already see the new i_size.  The caller
> > + * must make sure (generally by holding i_mutex but e.g. XFS uses its private
> > + * lock) i_size cannot change from the new value while we are called. It must
> > + * also make sure userspace cannot observe new i_size value before we are
> > + * prepared to store mmap writes upto new inode size (otherwise userspace could
> > + * think it stored data via mmap within i_size but they would get zeroed due to
> > + * writeback & reclaim because they have no backing blocks).
> >   */
> >  void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
> >  {
> 
> May I suggest that the comment preceding truncate_setsize() also be
> updated/removed?
  But that comment is actually still true AFAICT because VFS takes i_mutex
before calling into ->setattr(). So we hold i_mutex in truncate_setsize()
even for XFS.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH] mm: Improve comment before pagecache_isize_extended()
  2014-11-04 15:33   ` Jan Kara
@ 2014-11-04 16:20     ` Jan Beulich
  2014-11-04 19:37       ` Jan Kara
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Beulich @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Dave Chinner, linux-mm, Andrew Morton, linux-fsdevel

>>> On 04.11.14 at 16:33, <jack@suse.cz> wrote:
> On Tue 04-11-14 12:20:26, Jan Beulich wrote:
>> >>> On 04.11.14 at 12:43, <jack@suse.cz> wrote:
>> > --- a/mm/truncate.c
>> > +++ b/mm/truncate.c
>> > @@ -743,10 +743,13 @@ EXPORT_SYMBOL(truncate_setsize);
>> >   * changed.
>> >   *
>> >   * The function must be called after i_size is updated so that page fault
>> > - * coming after we unlock the page will already see the new i_size.
>> > - * The function must be called while we still hold i_mutex - this not only
>> > - * makes sure i_size is stable but also that userspace cannot observe new
>> > - * i_size value before we are prepared to store mmap writes at new inode size.
>> > + * coming after we unlock the page will already see the new i_size.  The caller
>> > + * must make sure (generally by holding i_mutex but e.g. XFS uses its private
>> > + * lock) i_size cannot change from the new value while we are called. It must
>> > + * also make sure userspace cannot observe new i_size value before we are
>> > + * prepared to store mmap writes upto new inode size (otherwise userspace could
>> > + * think it stored data via mmap within i_size but they would get zeroed due to
>> > + * writeback & reclaim because they have no backing blocks).
>> >   */
>> >  void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
>> >  {
>> 
>> May I suggest that the comment preceding truncate_setsize() also be
>> updated/removed?
>   But that comment is actually still true AFAICT because VFS takes i_mutex
> before calling into ->setattr(). So we hold i_mutex in truncate_setsize()
> even for XFS.

I doubt that, especially in the light of the WARN_ON() that
prompted all this:

[<ffffffff810053fa>] dump_trace+0x7a/0x350
[<ffffffff810050de>] show_stack_log_lvl+0xee/0x150
[<ffffffff810064fc>] show_stack+0x1c/0x50
[<ffffffff8138e4e3>] dump_stack+0x68/0x7d
[<ffffffff81042c82>] warn_slowpath_common+0x82/0xb0
[<ffffffff810d3831>] pagecache_isize_extended+0x121/0x130
[<ffffffff810d4689>] truncate_setsize+0x29/0x50
[<ffffffffa056705f>] xfs_setattr_size+0x12f/0x440 [xfs]
[<ffffffffa055cbf7>] xfs_file_fallocate+0x297/0x310 [xfs]
[<ffffffff81111b59>] do_fallocate+0x169/0x190
[<ffffffff8111206e>] SyS_fallocate+0x4e/0x90
[<ffffffff81392712>] system_call_fastpath+0x12/0x17
[<00007f0e6bdddf45>] 0x7f0e6bdddf45

I.e. truncate_setsize() is being called here without the mutex
held (or else the WARN_ON() wouldn't have got triggered in
the first place).

Jan



* Re: [PATCH] mm: Improve comment before pagecache_isize_extended()
  2014-11-04 16:20     ` Jan Beulich
@ 2014-11-04 19:37       ` Jan Kara
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2014-11-04 19:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jan Kara, Dave Chinner, linux-mm, Andrew Morton, linux-fsdevel

On Tue 04-11-14 16:20:08, Jan Beulich wrote:
> >>> On 04.11.14 at 16:33, <jack@suse.cz> wrote:
> > On Tue 04-11-14 12:20:26, Jan Beulich wrote:
> >> >>> On 04.11.14 at 12:43, <jack@suse.cz> wrote:
> >> > --- a/mm/truncate.c
> >> > +++ b/mm/truncate.c
> >> > @@ -743,10 +743,13 @@ EXPORT_SYMBOL(truncate_setsize);
> >> >   * changed.
> >> >   *
> >> >   * The function must be called after i_size is updated so that page fault
> >> > - * coming after we unlock the page will already see the new i_size.
> >> > - * The function must be called while we still hold i_mutex - this not only
> >> > - * makes sure i_size is stable but also that userspace cannot observe new
> >> > - * i_size value before we are prepared to store mmap writes at new inode size.
> >> > + * coming after we unlock the page will already see the new i_size.  The caller
> >> > + * must make sure (generally by holding i_mutex but e.g. XFS uses its private
> >> > + * lock) i_size cannot change from the new value while we are called. It must
> >> > + * also make sure userspace cannot observe new i_size value before we are
> >> > + * prepared to store mmap writes upto new inode size (otherwise userspace could
> >> > + * think it stored data via mmap within i_size but they would get zeroed due to
> >> > + * writeback & reclaim because they have no backing blocks).
> >> >   */
> >> >  void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
> >> >  {
> >> 
> >> May I suggest that the comment preceding truncate_setsize() also be
> >> updated/removed?
> >   But that comment is actually still true AFAICT because VFS takes i_mutex
> > before calling into ->setattr(). So we hold i_mutex in truncate_setsize()
> > even for XFS.
> 
> I doubt that, especially in the light of the WARN_ON() that
> prompted all this:
> 
> [<ffffffff810053fa>] dump_trace+0x7a/0x350
> [<ffffffff810050de>] show_stack_log_lvl+0xee/0x150
> [<ffffffff810064fc>] show_stack+0x1c/0x50
> [<ffffffff8138e4e3>] dump_stack+0x68/0x7d
> [<ffffffff81042c82>] warn_slowpath_common+0x82/0xb0
> [<ffffffff810d3831>] pagecache_isize_extended+0x121/0x130
> [<ffffffff810d4689>] truncate_setsize+0x29/0x50
> [<ffffffffa056705f>] xfs_setattr_size+0x12f/0x440 [xfs]
> [<ffffffffa055cbf7>] xfs_file_fallocate+0x297/0x310 [xfs]
> [<ffffffff81111b59>] do_fallocate+0x169/0x190
> [<ffffffff8111206e>] SyS_fallocate+0x4e/0x90
> [<ffffffff81392712>] system_call_fastpath+0x12/0x17
> [<00007f0e6bdddf45>] 0x7f0e6bdddf45
> 
> I.e. truncate_setsize() is being called here without the mutex
> held (or else the WARN_ON() wouldn't have got triggered in
> the first place).
  Ah, OK, I was thinking about the standard truncate path and didn't notice
that xfs_setattr_size() can also get called from the fallocate code.
I'll fix that comment as well.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

