[PATCH, RFC] check for frozen filesystems in the mmap path

linux-fsdevel.vger.kernel.org archive mirror

* [PATCH, RFC] check for frozen filesystems in the mmap path
@ 2009-04-16 23:45 Eric Sandeen
  2009-04-20 14:55 ` Rik van Riel
  2009-04-21  5:11 ` KOSAKI Motohiro
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Sandeen @ 2009-04-16 23:45 UTC (permalink / raw)
  To: linux-fsdevel, Linux Kernel Mailing List

Stephen Tweedie mentioned to me a concern that while a filesystem
is frozen, data could be dirtied for it via mmap, thereby using
up enough memory that the unfreeze process may be stuck trying
to allocate memory by writing back mmap-dirty data to the frozen
fs.

Christoph suggested maybe a check_frozen in the mmap path to
prevent this; does the sort of thing below look sane?

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -1944,6 +1944,7 @@ static int do_wp_page(struct mm_struct *
 		 * read-only shared pages can get COWed by
 		 * get_user_pages(.write=1, .force=1).
 		 */
+		vfs_check_frozen(old_page->mapping->host->i_sb, SB_FREEZE_WRITE);
 		if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
 			struct vm_fault vmf;
 			int tmp;
@@ -2660,6 +2661,7 @@ static int __do_fault(struct mm_struct *
 			 * address space wants to know that the page is about
 			 * to become writable
 			 */
+			vfs_check_frozen(vmf.page->mapping->host->i_sb, SB_FREEZE_WRITE);
 			if (vma->vm_ops->page_mkwrite) {
 				int tmp;
 



* Re: [PATCH, RFC] check for frozen filesystems in the mmap path
  2009-04-16 23:45 [PATCH, RFC] check for frozen filesystems in the mmap path Eric Sandeen
@ 2009-04-20 14:55 ` Rik van Riel
  2009-04-21  5:11 ` KOSAKI Motohiro
  1 sibling, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2009-04-20 14:55 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-fsdevel, Linux Kernel Mailing List

Eric Sandeen wrote:

> Christoph suggested maybe a check_frozen in the mmap path to
> prevent this; does the sort of thing below look sane?

Yes it does.

> Signed-off-by: Eric Sandeen <sandeen@redhat.com>

Acked-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed.


* Re: [PATCH, RFC] check for frozen filesystems in the mmap path
  2009-04-16 23:45 [PATCH, RFC] check for frozen filesystems in the mmap path Eric Sandeen
  2009-04-20 14:55 ` Rik van Riel
@ 2009-04-21  5:11 ` KOSAKI Motohiro
  2009-04-21 15:15   ` Eric Sandeen
  1 sibling, 1 reply; 8+ messages in thread
From: KOSAKI Motohiro @ 2009-04-21  5:11 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: kosaki.motohiro, linux-fsdevel, Linux Kernel Mailing List



> Index: linux-2.6/mm/memory.c
> ===================================================================
> --- linux-2.6.orig/mm/memory.c
> +++ linux-2.6/mm/memory.c
> @@ -1944,6 +1944,7 @@ static int do_wp_page(struct mm_struct *
>  		 * read-only shared pages can get COWed by
>  		 * get_user_pages(.write=1, .force=1).
>  		 */
> +		vfs_check_frozen(old_page->mapping->host->i_sb, SB_FREEZE_WRITE);
>  		if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
>  			struct vm_fault vmf;
>  			int tmp;

This seems strange.

1. It seems to have a race:

	CPU0				CPU1
	----------------------------------------------------
	do_wp_page
	 vfs_check_frozen
					ioctl_fsfreeze
					  freeze_bdev
					    __fsync_super
	process touches memory

vfs_check_frozen only waits for an unfreeze; it does not prevent a
new freeze request from starting.


2. This logic kills multi-threaded applications.

It means mmap_sem stays held until the unfreeze, so other threads
in the same process can't page-fault even though they don't touch
the frozen sb.
That seems strange.
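
The interleaving in point 1 can be replayed in a small userspace model. This is only a sketch under stated assumptions, not kernel code: `race_sb`, `model_check_frozen`, `model_freeze`, `model_dirty_page` and `replay_fault` are hypothetical names, and the check returns a flag where the kernel would sleep. It shows that a wait-only check leaves a window between the check and the store in which a freeze can begin.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum freeze_level { UNFROZEN = 0, FREEZE_WRITE = 1 };

struct race_sb {
	enum freeze_level frozen;	/* stands in for sb->s_frozen */
	int dirtied_while_frozen;	/* pages dirtied after the freeze began */
};

/*
 * Wait-only check: true means the caller may proceed. The kernel would
 * sleep here instead of returning false. Nothing is held afterwards that
 * would stop a freeze from starting.
 */
static bool model_check_frozen(const struct race_sb *sb)
{
	return sb->frozen < FREEZE_WRITE;
}

static void model_freeze(struct race_sb *sb)
{
	sb->frozen = FREEZE_WRITE;
}

static void model_dirty_page(struct race_sb *sb)
{
	if (sb->frozen >= FREEZE_WRITE)
		sb->dirtied_while_frozen++;
}

/*
 * Replays the diagram: CPU0 checks, CPU1 optionally freezes, CPU0 stores.
 * Returns how many pages were dirtied while the filesystem was frozen.
 */
static int replay_fault(bool freeze_between)
{
	struct race_sb sb = { UNFROZEN, 0 };
	bool may_proceed = model_check_frozen(&sb);	/* CPU0: check passes */

	if (freeze_between)
		model_freeze(&sb);			/* CPU1: freeze begins */
	if (may_proceed)
		model_dirty_page(&sb);			/* CPU0: store lands */
	return sb.dirtied_while_frozen;
}
```

Calling `replay_fault(true)` reports one page dirtied after the freeze began, while `replay_fault(false)` reports none. Whether that small window matters depends on the goal; it bounds how much can slip through rather than closing the gap.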


* Re: [PATCH, RFC] check for frozen filesystems in the mmap path
  2009-04-21  5:11 ` KOSAKI Motohiro
@ 2009-04-21 15:15   ` Eric Sandeen
  2009-04-22  1:35     ` KOSAKI Motohiro
  2009-04-22  4:49     ` KOSAKI Motohiro
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Sandeen @ 2009-04-21 15:15 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-fsdevel, Linux Kernel Mailing List

KOSAKI Motohiro wrote:
> 
>> Index: linux-2.6/mm/memory.c
>> ===================================================================
>> --- linux-2.6.orig/mm/memory.c
>> +++ linux-2.6/mm/memory.c
>> @@ -1944,6 +1944,7 @@ static int do_wp_page(struct mm_struct *
>>  		 * read-only shared pages can get COWed by
>>  		 * get_user_pages(.write=1, .force=1).
>>  		 */
>> +		vfs_check_frozen(old_page->mapping->host->i_sb, SB_FREEZE_WRITE);
>>  		if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
>>  			struct vm_fault vmf;
>>  			int tmp;
> 
> This seems strange.
> 
> 1. It seems to have a race:
> 
> 	CPU0				CPU1
> 	----------------------------------------------------
> 	do_wp_page
> 	 vfs_check_frozen
> 					ioctl_fsfreeze
> 					  freeze_bdev
> 					    __fsync_super
> 	process touches memory
> 
> vfs_check_frozen only waits for an unfreeze; it does not prevent a
> new freeze request from starting.

Well, I think that is ok.  I don't *think* that any IO can actually
happen to the filesystem even if it gets dirtied via mmap, so if a bit
of mmap-dirtied memory sneaks in before it's actually frozen, I'm not
sure that's really a problem.  The goal was to prevent massive amounts
of memory from getting dirtied, backed by the frozen filesystem.  This
would potentially lead to a situation where the un-freezing thread was
stuck waiting for memory to free up, stuck behind waiting for the
filesystem to unfreeze for writeout, and we can't unfreeze.

> 2. This logic kills multi-threaded applications.
> 
> It means mmap_sem stays held until the unfreeze, so other threads
> in the same process can't page-fault even though they don't touch
> the frozen sb.
> That seems strange.

Hm, I hadn't thought about this ... On the one hand, ->page_mkwrite can
already sleep, though a userspace freeze/unfreeze could potentially take
much much longer.  freeze/unfreeze *should* happen very quickly, but
nothing enforces that.

Do you have any suggestions?

Thanks,
-Eric


* Re: [PATCH, RFC] check for frozen filesystems in the mmap path
  2009-04-21 15:15   ` Eric Sandeen
@ 2009-04-22  1:35     ` KOSAKI Motohiro
  2009-04-22  4:49     ` KOSAKI Motohiro
  1 sibling, 0 replies; 8+ messages in thread
From: KOSAKI Motohiro @ 2009-04-22  1:35 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: kosaki.motohiro, linux-fsdevel, Linux Kernel Mailing List

> >> Index: linux-2.6/mm/memory.c
> >> ===================================================================
> >> --- linux-2.6.orig/mm/memory.c
> >> +++ linux-2.6/mm/memory.c
> >> @@ -1944,6 +1944,7 @@ static int do_wp_page(struct mm_struct *
> >>  		 * read-only shared pages can get COWed by
> >>  		 * get_user_pages(.write=1, .force=1).
> >>  		 */
> >> +		vfs_check_frozen(old_page->mapping->host->i_sb, SB_FREEZE_WRITE);
> >>  		if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
> >>  			struct vm_fault vmf;
> >>  			int tmp;
> > 
> > This seems strange.
> > 
> > 1. It seems to have a race:
> > 
> > 	CPU0				CPU1
> > 	----------------------------------------------------
> > 	do_wp_page
> > 	 vfs_check_frozen
> > 					ioctl_fsfreeze
> > 					  freeze_bdev
> > 					    __fsync_super
> > 	process touches memory
> > 
> > vfs_check_frozen only waits for an unfreeze; it does not prevent a
> > new freeze request from starting.
> 
> Well, I think that is ok.  I don't *think* that any IO can actually
> happen to the filesystem even if it gets dirtied via mmap, so if a bit
> of mmap-dirtied memory sneaks in before it's actually frozen, I'm not
> sure that's really a problem.  The goal was to prevent massive amounts
> of memory from getting dirtied, backed by the frozen filesystem.  This
> would potentially lead to a situation where the un-freezing thread was
> stuck waiting for memory to free up, stuck behind waiting for the
> filesystem to unfreeze for writeout, and we can't unfreeze.

Ah, I see.
One more question.

Why doesn't the dirty limit work properly in this case?



> > 2. This logic kills multi-threaded applications.
> > 
> > It means mmap_sem stays held until the unfreeze, so other threads
> > in the same process can't page-fault even though they don't touch
> > the frozen sb.
> > That seems strange.
> 
> Hm, I hadn't thought about this ... On the one hand, ->page_mkwrite can
> already sleep, though a userspace freeze/unfreeze could potentially take
> much much longer.  freeze/unfreeze *should* happen very quickly, but
> nothing enforces that.
> 
> Do you have any suggestions?








* Re: [PATCH, RFC] check for frozen filesystems in the mmap path
  2009-04-21 15:15   ` Eric Sandeen
  2009-04-22  1:35     ` KOSAKI Motohiro
@ 2009-04-22  4:49     ` KOSAKI Motohiro
  2009-04-22  5:01       ` Eric Sandeen
  1 sibling, 1 reply; 8+ messages in thread
From: KOSAKI Motohiro @ 2009-04-22  4:49 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: kosaki.motohiro, linux-fsdevel, Linux Kernel Mailing List,
	Rik van Riel

> > 2. This logic kills multi-threaded applications.
> > 
> > It means mmap_sem stays held until the unfreeze, so other threads
> > in the same process can't page-fault even though they don't touch
> > the frozen sb.
> > That seems strange.
> 
> Hm, I hadn't thought about this ... On the one hand, ->page_mkwrite can
> already sleep, though a userspace freeze/unfreeze could potentially take
> much much longer.  freeze/unfreeze *should* happen very quickly, but
> nothing enforces that.
> 
> Do you have any suggestions?

One more comment.

I read ioctl_fsfreeze() and freeze_bdev(); they call __fsync_super().
I don't think __fsync_super() is very quick.

Page faults have one unique characteristic: if the fault handler
returns 0 without changing the pte, the fault simply occurs again soon.
So if you need to wait for a long time, I think you can use the
following technique:

	unlock mmap_sem
	wait long-time
	lock mmap_sem
	goto out;

This unintentionally increments the page-fault counter twice, but
that is no problem: fs-freeze is not a frequent event.

Am I missing anything?
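
The unlock/wait/relock technique above can be sketched as a small userspace model. It is a sketch under stated assumptions, not the kernel's fault path: `retry_mm`, `retry_sb`, `model_wait_for_thaw`, `model_write_fault` and `model_store` are hypothetical names, the lock is a plain flag, and the wait thaws the filesystem immediately so the model stays deterministic. The point it illustrates is that the lock is not held during the wait and that the fault is simply taken a second time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct retry_mm {
	bool mmap_sem_held;	/* stands in for mm->mmap_sem */
	int fault_count;	/* how many times the fault handler ran */
};

struct retry_sb {
	bool frozen;		/* stands in for a frozen superblock */
};

enum fault_result { FAULT_DONE, FAULT_RETRY };

/*
 * Stand-in for sleeping until the filesystem is thawed. A real wait would
 * block; here the thaw happens at once. The assertion records the property
 * that matters: other threads could take mmap_sem during the wait.
 */
static void model_wait_for_thaw(const struct retry_mm *mm, struct retry_sb *sb)
{
	assert(!mm->mmap_sem_held);
	sb->frozen = false;
}

/* Called with mmap_sem held and returns with it held. */
static enum fault_result model_write_fault(struct retry_mm *mm,
					   struct retry_sb *sb,
					   bool *pte_writable)
{
	mm->fault_count++;
	if (sb->frozen) {
		mm->mmap_sem_held = false;	/* unlock mmap_sem */
		model_wait_for_thaw(mm, sb);	/* wait long-time */
		mm->mmap_sem_held = true;	/* lock mmap_sem */
		return FAULT_RETRY;		/* goto out, pte untouched */
	}
	*pte_writable = true;
	return FAULT_DONE;
}

/*
 * Models one store to a read-only shared page. Because the pte is left
 * unchanged on FAULT_RETRY, the access faults again; the loop stands in
 * for that. Returns the number of faults taken.
 */
static int model_store(struct retry_mm *mm, struct retry_sb *sb)
{
	bool pte_writable = false;
	enum fault_result res;

	do {
		mm->mmap_sem_held = true;	/* fault entry takes the lock */
		res = model_write_fault(mm, sb, &pte_writable);
		mm->mmap_sem_held = false;	/* fault exit drops it */
	} while (res == FAULT_RETRY);

	assert(pte_writable);
	return mm->fault_count;
}
```

With a frozen `retry_sb`, `model_store` takes two faults, which matches the doubled page-fault count mentioned above; with an unfrozen one it takes a single fault.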


* Re: [PATCH, RFC] check for frozen filesystems in the mmap path
  2009-04-22  4:49     ` KOSAKI Motohiro
@ 2009-04-22  5:01       ` Eric Sandeen
  2009-04-22  5:29         ` KOSAKI Motohiro
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Sandeen @ 2009-04-22  5:01 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-fsdevel, Linux Kernel Mailing List, Rik van Riel

KOSAKI Motohiro wrote:
>>> 2. This logic kills multi-threaded applications.
>>>
>>> It means mmap_sem stays held until the unfreeze, so other threads
>>> in the same process can't page-fault even though they don't touch
>>> the frozen sb.
>>> That seems strange.
>> Hm, I hadn't thought about this ... On the one hand, ->page_mkwrite can
>> already sleep, though a userspace freeze/unfreeze could potentially take
>> much much longer.  freeze/unfreeze *should* happen very quickly, but
>> nothing enforces that.
>>
>> Do you have any suggestions?
> 
> One more comment.
> 
> I read ioctl_fsfreeze() and freeze_bdev(); they call __fsync_super().
> I don't think __fsync_super() is very quick.

Well, what I mean is that the filesystem is not intended to be frozen
for long periods of time.  But it's not enforced by any method.

> Page faults have one unique characteristic: if the fault handler
> returns 0 without changing the pte, the fault simply occurs again soon.
> So if you need to wait for a long time, I think you can use the
> following technique:
> 
> 	unlock mmap_sem
> 	wait long-time
> 	lock mmap_sem
> 	goto out;
> 
> 
> This unintentionally increments the page-fault counter twice, but
> that is no problem: fs-freeze is not a frequent event.
> 
> Am I missing anything?

Hm, I'll have to think about that.  This is not my best area.  :)  So do
you mean that if a wait needs to happen for the frozen fs, we can
unlock, do that wait for unfreeze, relock, return early, and come back
again when it is not frozen?

One other thing that I think I just discovered is that nothing is
actually stopping mmap IO even on a frozen filesystem, as long as no
metadata updates are required for the IO... I'm seeing this on xfs
anyway (ext4 tries to update mtime, so that gets stopped on the frozen fs).

Thanks,
-Eric


* Re: [PATCH, RFC] check for frozen filesystems in the mmap path
  2009-04-22  5:01       ` Eric Sandeen
@ 2009-04-22  5:29         ` KOSAKI Motohiro
  0 siblings, 0 replies; 8+ messages in thread
From: KOSAKI Motohiro @ 2009-04-22  5:29 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: kosaki.motohiro, linux-fsdevel, Linux Kernel Mailing List,
	Rik van Riel

> KOSAKI Motohiro wrote:
> >>> 2. This logic kills multi-threaded applications.
> >>>
> >>> It means mmap_sem stays held until the unfreeze, so other threads
> >>> in the same process can't page-fault even though they don't touch
> >>> the frozen sb.
> >>> That seems strange.
> >> Hm, I hadn't thought about this ... On the one hand, ->page_mkwrite can
> >> already sleep, though a userspace freeze/unfreeze could potentially take
> >> much much longer.  freeze/unfreeze *should* happen very quickly, but
> >> nothing enforces that.
> >>
> >> Do you have any suggestions?
> > 
> > One more comment.
> > 
> > I read ioctl_fsfreeze() and freeze_bdev(); they call __fsync_super().
> > I don't think __fsync_super() is very quick.
> 
> Well, what I mean is that the filesystem is not intended to be frozen
> for long periods of time.  But it's not enforced by any method.
> 
> > Page faults have one unique characteristic: if the fault handler
> > returns 0 without changing the pte, the fault simply occurs again soon.
> > So if you need to wait for a long time, I think you can use the
> > following technique:
> > 
> > 	unlock mmap_sem
> > 	wait long-time
> > 	lock mmap_sem
> > 	goto out;
> > 
> > 
> > This unintentionally increments the page-fault counter twice, but
> > that is no problem: fs-freeze is not a frequent event.
> > 
> > Am I missing anything?
> 
> Hm, I'll have to think about that.  This is not my best area.  :)  So do
> you mean that if a wait needs to happen for the frozen fs, we can
> unlock, do that wait for unfreeze, relock, return early, and come back
> again when it is not frozen?

Yes.


> One other thing that I think I just discovered is that nothing is
> actually stopping mmap IO even on a frozen filesystem, as long as no
> metadata updates are required for the IO... I'm seeing this on xfs
> anyway (ext4 tries to update mtime, so that gets stopped on the frozen fs).

I don't understand this issue. Sorry, I'm not an fs expert ;)





